CLASSROOM TEACHERS IMPROVE THE PERSONALITY
ADJUSTMENT OF THEIR PUPILS
CHARLES D. FLORY, ELIZABETH ALDEN, AND MADELINE SIMMONS
Lawrence College

Editor's note: Not all of the important outcomes of teaching are academic in character. The author presents important facts on the influence of teachers on the personality adjustment of pupils.

PERSONALITY development has been emphasized so much in recent years that some writers would make it the major function of the school. The degree of attention given to this aspect of development makes certain that the modern teacher has little possibility of escaping some of the issues connected with problems of pupil adjustment. It is well recognized that personality should maintain harmonious development with other aspects of the individual. Teachers also admit that personality evaluations are desirable and that problems in adjustment should be corrected when possible. There are, however, some doubts concerning the validity of the so-called personality tests and more doubts about the ability of the classroom teacher to deal with the complex problems of adjustment. It was the purpose of this study to determine whether the regular classroom teacher could bring about improvement in the personal adjustments of her pupils when diagnoses were made by a standardized personality test.


PROCEDURE


As a part of a city-wide survey, in February 1940, every fourth-grade pupil in the Appleton, Wisconsin Public Schools was given the California Personality Test. This particular test makes possible a diagnostic profile for each child. One section of the profile purports to indicate how the child feels about himself, while the other section consists of components of social adjustment. The average of the two parts is considered a measure of the child's Total Personality Adjustment. An analysis of the data for the fourth grade revealed that children in Appleton were somewhat better adjusted than average, having a median Total Adjustment at the 56th percentile. But there were 26 children, representing about 10 per cent of the total group, who were at or below the 25th percentile in their adjustment. Individual diagnoses and profiles were made for each of these 26 children. The elementary-school supervisor agreed to the proposal that the classroom teacher should attempt during the year to bring about as much improvement as possible in the personality adjustment of these subjects. The complete diagnoses and profiles were turned over to the respective fifth-grade teachers in September, 1940. The supervisor, who delivered the information to the teachers, suggested that each teacher use her own devices to bring about better personality adjustment.* The teachers were informed that further tests would be made to determine whether changes had occurred but they were assured that a failure to bring about improvement would in no wise jeopardize their position. Everyone was aware that the program was purely experimental. The subjects were scattered throughout the seven elementary school buildings in the city.

Miss Alden and Miss Simmons followed the pupils through the fifth and sixth grades respectively, giving, scoring, and analyzing the tests.
These 26 children were not as a group especially low in either intelligence or achievement. The median IQs on the California Mental Maturity Test were: Non-language 102; Language 110; and Total 105. The median grade levels on the Progressive Achievement Test were: Reading Vocabulary 5.63; Reading Comprehension 5.37; Arithmetic Reasoning 5.05; Arithmetic Fundamentals 4.85; Language 4.78; and Total Achievement 4.88. The grade norm at the time the achievement test was administered was 4.8. Thus it is clear that these 26 subjects were as a group low only in personality development.
Three of these pupils left the city during the school year 1940-41, but 23 were re-tested in March, 1941 and again in March, 1942. The results of the study therefore extend over the two-year span of the fifth and sixth grades. The sixth-grade teachers received the March, 1941 profiles in September as had the fifth-grade teachers the previous year. Any improvement found was the result of the individual classroom teachers' efforts with no outside help other than the diagnostic profiles from the test. Improvement was measured by administering the same test on successive years. The teachers had nothing to do with giving or scoring the tests.

* Since significant gains seem to have resulted from the teachers' efforts to improve the personality adjustment of the subjects of this study, there is an interest in the methods employed by individual teachers. Unfortunately no records of procedures were kept. The supervisor felt that the teachers would be more willing to cooperate in the study if detailed records were not required. A further study has been projected to deal specifically with this problem.


RESULTS 


The mean results obtained at the three testing periods have been presented in Table I. These 23 children improved about 20 percentile points in their personality adjustment during the first year and about 10 percentile points during the second year. They progressed from slightly below the 25th percentile to slightly above the median percentile when measured by the California Test of Personality. A more detailed analysis of their improvement has been presented in Table II. The gains on each part of the test have been presented graphically in Figure 1.

TABLE I
MEAN PERCENTILE RANKS IN THREE ASPECTS OF ADJUSTMENT

                       1940 Test       1941 Test        1942 Test
Self Adjustment        22.1 ± 1.13     47.0 ± [?]       [?] ± 4.50
Social Adjustment      [?] ± 1.44      45.5 ± 3.66      [?] ± 3.14
Total Adjustment       22.3 ± .73      [?]2.9 ± 2.70    [?]2.1 ± 3.33

TABLE II
MEDIAN SCORES ON EVERY COMPONENT FOR THE THREE TESTS

Variable                                   1940 Test      1941 Test      1942 Test
Self Adjustment                            22.1 ± 1.41    41.0 ± 3.03    [?]
  Self Reliance                            [?]            [?]            [?]
  Sense of Personal Worth                  23.6           61.7           65.0
  Sense of Personal Freedom                9.8            25.0           47.5
  Feeling of Belonging                     13.2           42.5           60.8
  Withdrawing Tendencies (Freedom from)    14.6           31.3           47.5
  Nervous Symptoms (Freedom from)          9.7            15.8           30.6
Social Adjustment                          25.5 ± 1.81    45.0 ± 4.66    63.6 ± 4.01
  Social Standards                         32.5           64.0           61.7
  Social Skills                            32.9           35.0           48.8
  Anti-Social Tendencies (Freedom from)    18.4           45.0           39.3
  Family Relations                         20.9           44.1           47.1
  School Relations                         13.1           31.7           35.0
  Community Relations                      29.3           45.0           63.1
Total Adjustment                           23.0 ± .91     42.5 ± 3.65    52.5 ± 4.17

[?] marks a figure that is illegible in the source.


[Figure 1. Profile of Median Scores for the Three Tests]


It is clear from the data presented in Table II that significant improvements in personality adjustment have been produced through the efforts of the classroom teachers.

The Total Adjustment median improved from 23 to 52.5 during the two-year period. An increase of 29.5 ± 4.27 percentile points with a critical ratio above 7 indicates a highly significant improvement. There is a significant increase in both Self and Social Adjustment; the critical ratios are above [?] and 6 respectively.
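
The critical ratio for Total Adjustment can be checked with a few lines of arithmetic. The sketch below is an illustration, not the authors' stated procedure: it assumes that the ± figures in Table II are error terms for the two medians (read here as 23.0 ± .91 and 52.5 ± 4.17) and that they combine as independent errors, an assumption that happens to reproduce the 4.27 given in the text. The function name `critical_ratio` is hypothetical.

```python
import math


def critical_ratio(first, first_err, second, second_err):
    """Return (difference, error of the difference, critical ratio).

    Assumes the two error terms are independent, so the error of the
    difference is the square root of the sum of their squares.
    """
    diff = second - first
    err_diff = math.sqrt(first_err ** 2 + second_err ** 2)
    return diff, err_diff, diff / err_diff


# Total Adjustment medians for the 1940 and 1942 tests, as read from Table II.
diff, err_diff, ratio = critical_ratio(23.0, 0.91, 52.5, 4.17)
print(f"gain = {diff:.1f}, error = {err_diff:.2f}, critical ratio = {ratio:.2f}")
```

Under these assumptions the gain of 29.5 points and its error of 4.27 match the text, and the ratio comes out near 6.9, close to the figure of 7 cited there.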
The larger gain in the fifth grade, about 20 percentile points, and the smaller gain in the sixth grade, about 10 percentile points, do not mean that the fifth-grade teachers were more effective than those in the sixth grade. Two factors seem to influence the amount of gain in these two years. First, it is relatively easy to make progress with pupils so near the bottom of the scale. Second, it seems likely that certain type problems and aspects of personality can be improved more readily than can other problems or aspects of personality. This latter point is well illustrated by reference to the profiles in Figure 1. Sense of Personal Worth jumped from the 24th percentile to the 62nd percentile during the first year but gained only 3 percentile points during the second year. Freedom from Nervous Symptoms, on the other hand, improved only 6 percentile points in the first year but increased 15 percentile points in the second year. It seems reasonable to assume that a teacher would be able to build up a child's opinion of himself more readily than she could eliminate nervous symptoms, many of which are likely habits of long standing. Another comparison reveals a similar pattern. Social Standards improved about 32 percentile points during the first year and dropped slightly in the second year of the study, while Social Skills made only slight improvement during the first year but almost reached the median percentile by the close of the sixth grade. A priori, it should be easier to change standards than to change skills. It is interesting to note also that Freedom from Nervous Symptoms and School Relations are the lowest aspects of adjustment at the end of the two-year experimental period. However, these children as a group are now so well-adjusted that many of the teachers remarked as the final test was being administered, "Why bother with these children? They don't need any more help. I have a number of children in my room who are much worse than they."
It seems significant that during the first year of the experiment the group gained on each part of the test except Self Reliance. It is also important that this aspect of personality is the only one for which the group was up to the norm at the beginning of the study. It seems probable that these children were so poorly adjusted in other aspects of their personal development that they had over-developed Self Reliance as a compensation. As corrective procedures made these children feel that they belonged to the group and that they had some value in the group, the necessity for compensating by relying entirely upon themselves had disappeared. However, during the following year they regained their superiority in Self Reliance while keeping the gains they had made in other aspects of their adjustment. Some of the studies on delinquent children have shown that Self Reliance is likely to be high in those individuals who are socially maladjusted. It may not be too dangerous to hazard the guess that some of these pupils would have become serious behavior problems if their personal adjustment had gone unnoticed by their teachers.
By selecting subjects who were at or below the 25th percentile in Total Adjustment the variability of the sample was necessarily restricted. The standard deviations for Self, Social, and Total Adjustment on each test are presented in Table III.

TABLE III
STANDARD DEVIATIONS FOR SELF, SOCIAL, AND TOTAL ADJUSTMENT

                       1940 Test    1941 Test    1942 Test
Self Adjustment        8.[?]        [?]1.20      32.13
Social Adjustment      10.[?]       26.19        27.76
Total Adjustment       5.20         [?]          23.70

[?] marks a figure that is illegible in the source.
The low standard deviation of 5.20 percentile points for Total Adjustment at the beginning of the study is explainable by the fact that 10 of the 23 subjects had scores at the 25th percentile and none were below the 10th percentile. At the end of the experiment 10 of the subjects were above the median for the test, while one was at the 90th percentile and 4 were still at the 10th percentile. At the end of the sixth grade 12 of the 23 subjects were at or above the 70th percentile in Self Adjustment, while 6 had attained a similar level in Social Adjustment.


The increase in variability reveals a condition suggestive of success for some children and failure for others when teachers are left to their own devices in treating maladjustments. If measures of central tendency alone were used as a criterion, the procedure would likely be considered highly successful. But individual analysis reveals that 5, or 22 per cent, of the children in the group studied failed to improve or actually regressed in their adjustment. Two of the 5 improved the first year, 25 to 45 and 15 to 30, and then dropped back the second year. One child lost 10 percentile points during the two-year period and two failed to gain. There was no way from the initial results to predict which children would improve under the teachers' guidance. One child jumped from the 10th percentile to the [?] percentile, while another child starting at the same level improved only 10 points. These results suggest that early diagnosis and treatment by classroom teachers must be supplemented by the services of specialists if improvement is to be expected for all children. The fact that a child fails to improve immediately under teacher direction does not mean that hope should be abandoned, for one child who ranked at the 15th percentile in the fourth grade was at the same point going into the sixth grade but jumped to the 55th percentile at the end of the sixth grade. Another child remained at the [?] percentile for the first year but attained the 60th percentile during the second year. These delayed gains suggest two possibilities: time in some cases may be the major factor in improving personal adjustments, and teacher-pupil relationships are likely to affect significantly the amount of gain observed.

There was no check to determine how much more rapid the progress would have been if the methods of the teachers had been augmented by the services of specialists. There was also no check to determine how much improvement would have resulted by growth without any help by the teacher. Some cases of personality maladjustment found in elementary schools are probably so complex that the regular personality tests do not make adequate diagnosis. It is also quite likely that with these complex problems remedial procedures should involve more than manipulation of the school environment.*


Some additional observations can be made when the amount of gain in personality status is compared with the intellectual and achievement levels of the children in the group. Such comparisons have been summarized in Table V. The 23 subjects were divided into three sub-groups: (1) No gain or loss in status; (2) Gains up to 25 percentile points; and (3) Gains of 30 percentile points or more.

The data in Table V reveal that children in this study have maintained about an average achievement level, which is somewhat below the median achievement of the children in the grades in which they have been placed.


* A further study has been planned to determine the types of personality problems in classrooms that should be treated by the teacher and the types that must be referred immediately to outside specialists. An attempt will also be made to determine whether there are significant changes in personality patterns during late childhood that can be attributed to growth.

There seems to be a slight relationship between IQ level and the amount of gain in personality status. Four of the 5 children with no gains or with losses had IQs below 100, one was as low as 77, and another had an IQ of 88. Two pupils in the group with small personality gains had IQs of 84 and 98 respectively but the other two were 116 and 117. The sub-group with the largest gains in personality scores has the highest mean IQ. While two pupils in this group fell below 100 there were 6 pupils with IQs above 110. It seems likely therefore that the regular classroom teacher will be most successful in her attempts to improve the personality adjustment of her brighter pupils. Intellectual slowness itself may be a factor holding back those children in the group that failed to gain in personality status.


TABLE V
THE AMOUNT OF GAIN IN RELATION TO INTELLECTUAL AND ACHIEVEMENT LEVELS

Sub-Groups                     N     Mean IQ    Mean Grade Equivalent
No gain or loss in status      5     92         5.9
Gains up to 25 points          4     104        6.2
Gains of 30 points or more     14    108        6.1


CONCLUSIONS 
The validity of personality tests, the lack of controls in the experiment, and the size of the sample may raise certain questions concerning the generalizations from the results reported. However, the findings seem suggestive enough to warrant the following conclusions:

1. There was a significant improvement in the personality adjustment of the 23 pupils used as subjects in this study. Their status at the end of the sixth grade was represented by a median score in Total Adjustment which was normal for the general population.

2. The gains observed were the result of the treatment prescribed by the regular classroom teachers without the help of specialists.

3. The amount of gain was about twice as great the first year as in the second year.

4. Five pupils failed to gain or actually regressed in their personal adjustments, suggesting the need for the services of child specialists with the most complex cases.

5. There seems to be a slight positive relationship between the level of intelligence and the amount of gain.

6. The subjects in this study have maintained a normal achievement record during the two years of the experiment.

A TEST RELATING EDUCATIONAL THEORY AND PRACTICE


C. ROBERT PACE* 
Wa ef D 


Editor's note: One of the important problems in education is that of bridging the gap between theory and practice. The author presents a measure of the relationship between a philosophy of education and practice.


WHEN students in a special pre-service program at Teachers College, Columbia University, were asked for suggestions for appraising their progress they said that a problems or situations type of test which called for the relating of educational theory and practice would, in their opinion, be most appropriate. The fairest test, they said, would be one which required them to apply their understandings of human growth and learning, and of democratic processes, to actual teaching situations. Development of the ability to relate theory and practice was, in fact, the basic objective of the special pre-service program, for this program consisted of three major elements (central seminar, divisional seminar, and practice teaching), each of which continued throughout the year. Problems met in practice teaching frequently became topics for discussion in both central seminar and divisional seminar. Each week students wrote brief descriptions of some of the problems they had encountered during their practice teaching and these accounts were turned over to the divisional and central seminar leaders so that relationships among the three elements of the program could be facilitated.

To proceed with the preparation of a test in accord with students' suggestions, the evaluation staff constructed a few sample items which were presented to the faculty of the central seminar for approval in principle. With this approval a final examination of twenty-five situations was constructed. Many of the situations described in the test were selected almost verbatim from students' own statements of problems actually faced in their practice teaching.
Instructions for the test and three sample items are illustrated on the following pages.

* The advice and assistance of Dr. Irving Lorge and Mr. Chester Junek are gratefully acknowledged.

DIRECTIONS

The actions of a teacher in a classroom reflect his philosophy of education. In this test you are given twenty-five situations with some characteristic actions, opinions, or judgments which might be taken concerning them. Each such action, opinion, or judgment may be a reflection of one of the following controlling points of view:

It may reflect either an autocratic concept of education; or it may reflect a democratic concept of education.

It may reflect a concept of learning which emphasizes the importance of discipline and mastery of subject matter; or, a concept of learning which emphasizes the importance of students' interests and problem-solving [...].

It may reflect an organismic concept of human nature; or, an atomistic, intellectualistic, or mental faculty concept of human nature.

Before each activity, opinion or judgment for each situation are three groups of letters:

ADN MSN OFN

You are to cross out the letter that represents your judgment as to the controlling concept involved, where

A indicates the controlling concept is autocratic
D indicates the controlling concept is democratic
N indicates that neither A nor D is involved

M indicates the controlling concept is mastery of subject matter or discipline
S indicates the controlling concept is student interest
N indicates that neither M nor S is involved

O indicates the controlling concept is organismic
F indicates the controlling concept is mental faculty or atomistic
N indicates that neither O nor F is involved

You should cross out one letter in each group of three, and you should have a total of three letters crossed out for each activity, opinion or judgment under each situation. Do not leave any group of items blank.

After you have classified the activities listed under each situation you are to indicate acceptability for the actions; e.g., you are to mark with a plus (+) all those activities that you consider acceptable or desirable, you are to mark with a minus (-) all those activities that you consider inadvisable or undesirable, and you are to leave blank those that you consider neither desirable nor undesirable. In other words, in the parenthesis at the right of the actions you are to put a plus (+) for desirable, a minus (-) for undesirable actions.


Sample No. 1

Ailene was a rather large, stolid, adolescent girl whose main academic offense was going to sleep in class. When not actually asleep she seemed to be in a daze. Her school marks were average. The following suggestions were made for handling this case:

ADN MSN OFN  1. Give her an intelligence test. ( )
ADN MSN OFN  2. Arrange for her to get more fresh air. ( )
ADN MSN OFN  3. Send her to a doctor to have her thyroid examined. ( )
ADN MSN OFN  4. Report her indolence to her parents. ( )
ADN MSN OFN  5. Drop her from school. ( )

Sample No. 2

The department of education prescribes a series of units in grammar. The teacher knows that her students do not like grammar. In introducing these units to her sixth grade class she uses the following approach:

ADN MSN OFN  1. Today we study grammar. Get out your notebooks and we will do some exercises. ( )
ADN MSN OFN  2. She gives them an objective test, has them score their own papers and drills them on the correct answers. ( )
ADN MSN OFN  3. She says that we are all in the same boat. For the next few days we have to study grammar. How shall we go about it? ( )
ADN MSN OFN  4. She gives them some exercises to do at home and then in class she discusses their work with them. ( )
ADN MSN OFN  5. She uses history, a subject they like, as an excuse for getting them to write a short essay. Then she discusses grammar in their essays with them and gets them to rewrite the essays. ( )

Sample No. 3

A high school class was studying music appreciation. The young teacher was disturbed because the only kind of music the students seemed to like was swing music. After thinking about the problem she decided that:

ADN MSN OFN  1. She would take them to a symphony concert to hear some really good music. ( )
ADN MSN OFN  2. She would tell them stories about the lives of some of the great composers. ( )
ADN MSN OFN  3. She wouldn't play swing music for them any more because they hear enough of it outside of class anyway. ( )
ADN MSN OFN  4. Some swing music was better than others and there was considerable opportunity for critical judgment within the single field of popular dance music. ( )
ADN MSN OFN  5. She would continue to play swing for them but would also try to get them to notice the kinds of music played in motion pictures. ( )


In most situations or application tests students are merely asked to indicate which activity among those listed is most desirable under the circumstances given. The present test does not ask students to choose the "correct" solution to a problem. Rather, it asks them to recognize the educational concept underlying each alternative activity. In effect, it asks them this question: If you do thus and so under these circumstances, with what underlying philosophy is your action consistent? Thus, students must identify the controlling viewpoint in a variety of activities, some of which they may approve and some of which they may disapprove.

The concepts or points of view involved in the test are of fundamental concern in the education of teachers. The basic content of child and adolescent psychology is the description and interpretation of human nature; the basic content of educational psychology and curriculum centers around the nature of the learning process; and the basic content in educational philosophy and sociology centers around the desirability of various forms of social organization and group relationships. Essentially the situations test is an attempt to measure the extent to which students see the practical implications of these important theoretical concepts.

The test was given as a mid-year examination to 60 students in central seminar (a graduate group) and to 82 students in a special pre-service undergraduate seminar in Barnard and Columbia. Eighteen faculty members likewise took the test and their responses provided the basis for developing the scoring key.
The determination of a scoring key presented a problem. The usual way of scoring a test is to note the number of correct answers given, correct being defined as agreement with the judges or experts. If the present test were scored on the basis of agreement with the experts (taking 70 per cent or more agreement among the experts themselves to establish an answer as correct), only 44 per cent of the items in the test would be scoreable. There are, incidentally, 500 items in the test: each of the 25 situations has five alternative courses of action listed under it, and for each of the five courses of action the student must make four choices, three in identifying the controlling points of view and one to indicate the desirability or undesirability of the action. A system of scoring which could be applied to only 44 per cent of the 500 possible items was, quite obviously, inefficient. A system was devised, therefore, of scoring the tests on the basis of the number of untenable positions chosen. An untenable position was defined as one seldom or never chosen as correct by the faculty judges. A system was worked out so that there was a difference of approximately one standard deviation between a tenable and an untenable position. An untenable position was one chosen by fewer than 15 per cent of the faculty judges provided the alternative was chosen by more than 30 per cent; or one chosen by fewer than [?] per cent of the judges provided the alternative was chosen by more than [?] per cent. With this system of grading, 401, or 80 per cent, of the 500 items were scoreable.
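
The first of the two threshold rules above can be written out as a small function. This is a minimal sketch under stated assumptions, not the author's scoring procedure: it covers only the legible rule (fewer than 15 per cent against more than 30 per cent), it takes "the alternative" to mean the opposing letter in the same group (A against D, M against S, O against F), and the name `untenable_letter` is hypothetical. The percentages used are the judges' figures quoted for items 1 and 2 of sample situation three.

```python
from typing import Optional, Tuple

# A choice is a letter plus the per cent of judges who selected it.
Choice = Tuple[str, float]


def untenable_letter(first: Choice, second: Choice,
                     low: float = 15.0, high: float = 30.0) -> Optional[str]:
    """Return the letter judged untenable for one pair of opposing concepts.

    A letter is untenable when fewer than `low` per cent of the judges chose
    it while more than `high` per cent chose the opposing letter. "N" is
    never passed in, because it is never considered untenable. Returns None
    when neither letter meets the rule (the item would not be scored).
    """
    (letter_a, pct_a), (letter_b, pct_b) = first, second
    if pct_a < low and pct_b > high:
        return letter_a
    if pct_b < low and pct_a > high:
        return letter_b
    return None


# Item 1, "take them to a symphony concert": A 72, D 0 / M 76, S 6 / O 0, F 50.
item_1 = [
    untenable_letter(("A", 72), ("D", 0)),
    untenable_letter(("M", 76), ("S", 6)),
    untenable_letter(("O", 0), ("F", 50)),
]
# Item 2, "tell them stories ...": A 33, D 0, so "democratic" is untenable.
item_2_ad = untenable_letter(("A", 33), ("D", 0))
# A hypothetical pair where neither letter qualifies.
no_key = untenable_letter(("A", 20), ("D", 25))

print(item_1, item_2_ad, no_key)
```

For item 2 the function marks D as untenable, which agrees with the worked example given in the text for that item.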
The way in which the scoring key was prepared is illustrated further in Table I.

TABLE I
NUMBER AND PER CENT OF JUDGES' RESPONSES TO SAMPLE TEST SITUATION NUMBER THREE

        Number* of judges responding              Per cent of judges responding
Item    A   D   N     M   S   N     O   F   N     A   D   N     M   S   N     O   F   N
 1     13   0   5    13   1   3     0   9   9    72   0  28    76   6  18     0  50  50
 2      6   0  12    11   0   6     0  11   7    33   0  66    65   0  35     0  61  39
 3     12   0   6    13   0   4     0   8  10    66   0  33    76   0  24     0  44  55
 4      1   9   8     1  14   2    11   0   7     6  50  44     6  82  12    61   0  39
 5      3  10   5     2  14   1    13   1   4    17  55  28    12  82   6    72   6  22

* Occasionally one or more judges omitted an item and thus the total number of judges does not equal 18.
In item 1 in this situation, 13 judges, or 72 per cent, said that the response, "take them to a symphony concert", reflected an autocratic concept of education; no judge said it reflected a democratic concept; and 5 judges, or 28 per cent, said it reflected neither a democratic nor an autocratic concept. Reading across the top row of the table, one can see further that this response was believed by most of the judges to reflect an emphasis on subject matter mastery rather than student interest, and a mental faculty rather than an organismic concept of human nature. In item 2 the response, "tell them stories about the lives of some of the great composers", was considered by most of the judges to reflect neither an autocratic nor a democratic concept; but since more than 30 per cent said it was autocratic and fewer than 15 per cent said it was democratic, the response democratic was called untenable. While for many of the choices in these 5 items there is no clearly correct answer, there is always one that is clearly incorrect or untenable. The untenable responses to the three test samples reproduced above are:

Item    Sample 1       Sample 2       Sample 3
 1      [illegible]    [illegible]    D S O
 2      [illegible]    [illegible]    D S O
 3      [illegible]    [illegible]    D S O
 4      [illegible]    [illegible]    A M F
 5      [illegible]    [illegible]    [illegible]

The blank spaces in samples 1 and 2 indicate that no response could be called untenable and the item was not scored. It is apparent that this method of scoring is extremely liberal: students are penalized only for gross disagreement with the faculty judges, and the response "N" is never considered as untenable. The answers to the part of the test which asks students to judge the various activities as desirable or undesirable are not shown here. This aspect of the test was included tentatively only so that correlations could be run between those judgments and students' ability to identify practices with controlling viewpoints.
Analyses of students' papers indicated that the test possessed adequate reliability, validity, and discriminating power. Reliability was determined very crudely and only for Part I of the test. (Scores were added separately for the two parts of the test, Part I being those items calling for recognition of controlling viewpoints, and Part II being those items calling for judgments concerning the desirability or undesirability of the activities.) Students' scores on the 13 odd-numbered situations were compared with their scores on the 12 even-numbered situations with a resulting coefficient of .82, based on [illegible] cases from the central seminar. If reliability had been determined by comparison of response among 125 activities rather than among 25 situations, the resulting coefficient would approximate .9.
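
The article does not say how the projected coefficient was obtained. One standard way to project reliability onto a longer test is the Spearman-Brown prophecy formula, sketched below as an assumption rather than as the author's method. The lengthening factors shown (2 for stepping a half-test correlation up to the full test, 5 for 125 activities against 25 situations) are illustrative choices, and the function name `spearman_brown` is hypothetical.

```python
def spearman_brown(r: float, k: float) -> float:
    """Projected reliability when a test with reliability r is made k times as long."""
    return (k * r) / (1.0 + (k - 1.0) * r)


observed = 0.82  # odd-even coefficient reported for Part I

# Treating .82 as a half-test correlation and stepping it up to full length.
full_length = spearman_brown(observed, 2)

# Treating the comparison as five times as many units (125 against 25).
five_fold = spearman_brown(observed, 5)

print(f"k = 2: {full_length:.3f}")
print(f"k = 5: {five_fold:.3f}")
```

Either reading gives a value in the .90s, which is at least consistent with the approximate figure the text reports.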
A comparison of mean scores made by the graduate and undergraduate groups on both parts of the test revealed that the graduate group made significantly (critical ratios greater than 3.00) fewer untenable choices than the undergraduate group. This is certainly consistent with the results one would expect to find. The analysis is shown in Table II.

TABLE II
[Comparison of the mean scores of the graduate and undergraduate groups. The table is not legible in the source; surviving figures include 20.10, 10.97, 41.1, 15.81, 3.87, and 2.24.]

Within each group the relationship between the scores on the two parts of the test was computed. For the graduate group the resulting coefficient of correlation (uncorrected) was .486, and for the undergraduate group the correlation (uncorrected) was .445. While one can say in general that those students who are best able to recognize the concepts underlying specific practices are also best able to recognize the desirability of those specific practices, one would probably not be justified in combining Parts I and II to obtain a single total score.

A number of analyses were made which relate directly to the instructional program. A sample of 50 papers from the graduate group was analyzed to reveal the types of errors made most frequently in the test. On Part I, roughly one-third of the errors could be confusions between A and D concepts; another third could be confusions between M and S concepts; and another third could be confusions between O and F concepts. Actually, 21 per cent of the errors involved A and D, 37 per cent involved M and S, and 42 per cent involved O and F. Comparison of the difference between these obtained and theoretical frequencies shows it to be significant at the 5 per cent level. Thus, students on the whole did relatively well with the A and D concepts, and experienced most difficulty distinguishing between the O and F concepts. Further analyses were made of the relative frequency of errors within each of the three main opposing concepts. These showed that the A and D concepts were of about equal difficulty, that the S concept caused proportionately more difficulty than the M concept, and that activities involving the F concept proved more difficult to identify than ones involving the O concept. Analyses of individual papers identified those concepts which each person was having most difficulty in understanding. For example, the errors of some students were spread fairly evenly over the 3 main concepts; for other students, errors were concentrated in a single category. Information of this sort can be used not only to focus general instruction more sharply but to help individual students straighten out their chief misconceptions. The analysis of groups does not provide an adequate frame of reference from which to discuss the needs of individuals.
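The comparison of obtained with theoretical error frequencies can be sketched as a chi-square goodness-of-fit test against an equal three-way split. The source does not name the test used, so chi-square is an assumption here. The counts are also hypothetical: they suppose exactly 100 errors divided 21, 37, and 42, where 21 is the remainder implied by the two percentages that survive in the text. The true number of errors is not given, and the statistic grows in proportion to it.

```python
def chi_square_equal_split(observed):
    """Chi-square statistic against the hypothesis of equally likely categories."""
    expected = sum(observed) / len(observed)
    return sum((count - expected) ** 2 / expected for count in observed)


# Tabled 5 per cent critical value of chi-square for 2 degrees of freedom.
CRITICAL_5_PER_CENT_DF2 = 5.991

# Hypothetical counts (assumes 100 errors in all; see the note above).
errors = {"A-D": 21, "M-S": 37, "O-F": 42}
statistic = chi_square_equal_split(list(errors.values()))
significant = statistic > CRITICAL_5_PER_CENT_DF2
print(round(statistic, 2), significant)
```

Under these assumed counts the statistic is about 7.22, which exceeds the 5 per cent value and so agrees with the conclusion stated in the text.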

Item analyses of the responses of the highest 10 and the lowest 10 students in the sample of 50 students from the central seminar served to identify a number of weaknesses in the test. The most obvious weaknesses were that the test was too easy and that a relatively large number of items did not differentiate between the responses of best and poorest students. Both these weaknesses result chiefly from the generous method of scoring which was used: students were penalized only for gross errors, i.e., for selecting clearly untenable answers. This weakness is in large measure compensated for by the fact that the test was scored for 500 items rather than just 25 situations or 125 activities. With such a large number of items (even so, students finished the test in an average of one hour) the total test can be reliable and discriminating even when many items contribute little or nothing to the total score. However, as a result of item analysis, it was possible to improve the test by reducing the number of situations from 25 to 16 and by eliminating Part II entirely. In the original test, Part I, there were 375 items, of which 295 (79 per cent) were scoreable, and 170 (45 per cent) were differentiating. In the revised test of 16 situations there were 240 items of which 207 (86 per cent) were scoreable and 132 (55 per cent) were differentiating.
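The item analysis above compares the responses of the highest 10 and the lowest 10 papers on each item. The source does not state the criterion by which an item was counted as differentiating. What follows is a minimal sketch, assuming a simple upper-lower discrimination index and a hypothetical cutoff of 0.20; both function names and the cutoff are illustrative, not the author's.

```python
def discrimination_index(upper_correct, lower_correct, group_size):
    """Proportion answering acceptably in the top group minus that in the bottom group."""
    return (upper_correct - lower_correct) / group_size


def is_differentiating(upper_correct, lower_correct, group_size, cutoff=0.20):
    """True when the index reaches the (hypothetical) cutoff."""
    return discrimination_index(upper_correct, lower_correct, group_size) >= cutoff


# Hypothetical items: (acceptable answers among top 10, among bottom 10).
items = {"item_1": (9, 4), "item_2": (6, 5), "item_3": (10, 10)}
flags = {name: is_differentiating(up, low, 10) for name, (up, low) in items.items()}
print(flags)
```

An item that nearly everyone answers acceptably, such as the third one here, yields an index of zero. That is the pattern the text attributes to a test that is too easy.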


While the improvement of the test as a technical instrument for future use is no doubt important, perhaps more important from the standpoint of an evaluation program which is an integral part of instruction is the fact that the test was developed out of suggestions which came directly from students, that the situations described in the test came directly from experiences students had had in their practice teaching, and that a notable degree of faculty cooperation and assistance was involved in the development of the instrument. If one accepts the viewpoint that evaluation activities should promote understanding and insight as well as provide estimates of status or progress, it seems axiomatic to say that those who have an important stake in the outcome (students and staff alike) should play an important part in the process.


Equally important, too, is the suggestive value of this technique of testing for measuring some fundamental objectives in teacher education. It eliminates the necessity of labeling educational practices as good or bad. Rather, it provides students with a variety of practices, and simply asks them to identify the concepts or principles with which they are consistent. And the instructional value of thus associating theory and practice should prove helpful to students.

Editor's Note: The tendency of people to base important decisions upon insufficient data, estimates and guesses has been frequently noted. The author surveys this tendency with a number of very pertinent comments.

To EVALUATE is to ascertain the value of, to reckon up the worthwhileness, the goodness or badness of a process or thing. Thus educational evaluation involves the passing of judgment on the degree of worthwhileness of some teaching process or learning experience. As such it is inherent in the planning of next steps in any educational enterprise, and its legitimate function is as a part of this process of planning.


Always involved in such evaluation are two factors, the phenomena observed and the observer's set of values. That is to say, what one labels "good" or "bad" in an educational enterprise and the degree of worth which is assigned to it depends on such factors as the observer's ideas of the nature of learning and of the purposes of education as well as on what is observed or counted. In fact, one's philosophy of education, the scale of values which one has, usually determines what will be observed or counted. However, the appropriateness and dependability of the observations or counts which one makes when examining an educational endeavor are also essential factors in the soundness of the judgments made, and the important contribution of the scientific method to education has been the making available of more appropriate and dependable data from which to pass judgments.


The writer does not propose here to defend this concept of evaluation. Enough to point out that it represents a process which is constantly going on and one in which our hope of educational improvement lies. The fact is that when a teacher decides to require two term papers instead of one or to allow her students a larger share in deciding what they will study and how they will study it, when a principal decides to expand the mathematical offering in his school or to provide work experiences for his pupils, when a group of teachers decide to undertake the "core" curriculum, to individualize instruction or to be "stricter" with their discipline, when the patrons of a school decide they don't like "progressive" education and bring pressure to bear to get it out of their school, then evaluation, as conceived here, has been done. This is true whether the decisions are made on the basis of evidence or not.

To one who accepts this definition of evaluation the role of educational measurements is clear. Measurement means the counting of, a numerical description for, something; but implicit in its use is the assumption that through the measurement we have more appropriate and more dependable information on this something. Although educational measurements do not of themselves constitute evaluation, the reason for measuring is to get at data for evaluation. Thus good measurement is measurement which contributes to intelligent evaluation.


Unfortunately, educational measurements, as currently conceived, furnish inadequate data for evaluation. That this statement is true will, we hope, become evident as we examine into the information needed in order to make a valid judgment in education. No matter what assumptions concerning the nature and purposes of education we may make, it is the writer's opinion that data of the following sorts are necessary for intelligent evaluation of any educational endeavor:

1. We need dependable and complete information on the purposes of, and the methods of control employed by, the director of learning, that is, the teacher or experimenter.

2. We need dependable and complete information on the needs of, the purposes of, and the methods used by the learner.

3. We need dependable and complete information on the outcomes evident in the learner as the process continues and of the outcomes that may reasonably be expected to accrue as a result of the experience.

Since people in education are constantly passing judgment on the basis of much less information than is indicated by the above enumeration, in fact are scarcely conscious of the need for any such comprehensive data, we shall attempt to justify the position taken. Perhaps this can be done, along with making clearer some of the implications involved, through illustration.

How worthwhile was a certain course in remedial reading? This question is typical of many that face school people, and an intelligent answer to it should be had before any further steps concerning the course were taken. If the question were answered as many educational measurers would have it done, it would be decided on the basis of an incomplete, often unrealistic, statement of purpose and a very cursory description of the teaching methods supplemented by the results from standardized reading tests given initially and finally. But to answer the question on such evidence involves the acceptance of a long series of questionable assumptions. Actually, before one could intelligently evaluate the reading course, many other questions such as the following would need to be asked:


First, questions relating to the teacher's purposes and methods: Why did she teach the course? Who decided that it should be offered? What were the hoped-for outcomes? How did the teacher plan to measure success and failure? What changes took place in her purposes as the course proceeded? What role did she play in the class situation? What was the quality, intensity and consistency of control? What methods of planning were used? Who made what decisions? What types of motivation and of restraints were employed? What provisions were made for individual differences in purpose and in ability? What methods of testing and grading were used? Where were emphases placed? etc.


Secondly, questions relating to the pupils' needs, purposes and methods: How badly did the students need training in reading? What other needs were being neglected in order to meet this need? How would the time have been used if they had not taken this course? Why did they enter the course? What were their own reasons for entering it? What changes in purpose took place? How did they propose to measure success? How dependent on the teacher were they? Were their methods such that transfer was at a maximum and growth could be expected to continue? etc.

And finally questions relating to what taking the course has meant and may reasonably be expected to mean to the pupils: What outcomes were evident to the teacher and to the pupils as the course developed? What increases in reading skill? What other desirable outcomes? What undesirable outcomes? How lasting could we expect outcomes evident at the termination of work to be? What chance was there that growth would continue? etc.


It is only in the light of comprehensive information suggested by the above questions that we can intelligently evaluate the reading course, and this is true regardless of the assumptions we accept concerning the purpose of education. Let us examine the matter from the viewpoint of two extreme theories. If the purpose of education is to prepare for living, then the course would be judged to have worth if one could safely predict future values accruing to the learner from the experience. Would anyone deny that the purposes and methods of the teacher would influence the retention and use of, and later growth in, reading skill? Not only the stated purposes but those implicit in her conduct of the class, and not only the gross elements of method but the more subtle aspects such as the methods of management, the atmosphere of the situation, [...]. And just as important are the purposes and methods which the learner has and acquires. And what of the pupils' other needs? Presumably, these may be serious and pressing enough to justify labeling time spent in learning to read as wasted. Or, would anyone contend that initial and terminal test scores, two points on a growth curve which we hope will continue over a long time, are adequate for predicting the future? And what of other learnings of a personal and social nature? Should this course contribute its share toward developing the intelligence of the pupils? If so, then it is important that we know who made decisions concerning what to do [...] what success was had. It is also important that we collect signs of progress or lack of progress in this direction. Or, has the course prepared the students for becoming respected and self-respected citizens? To the extent that they have learned to cheat, or loaf, or depend unduly on the teacher and her inspiration, or to think of themselves as maladjusted and not very worthwhile persons, there is serious question concerning the worth of the course. It is doubtful that anyone would contend that all such factors (and there are many others) are reflected in the difference between the initial and final score on a standard reading test. If we are to judge the preparatory value of the course, more complete information is needed.


This is just as true if we conceive of education as the process of intelligent living. If we assume that the value of an educative process is to be sought in terms of the completeness and worthwhileness and intelligence of living during and in the process, then the purposes and methods of the teacher and the learner become even more important and outcomes are observed and counted primarily because they serve as one basis for judging the adequacy of purpose and the appropriateness of procedures. But data of the three sorts postulated above would still be needed. To one who accepts such a theory of education, was the remedial reading course worthwhile? If the reader will ask himself concerning the need for the kinds of information called for in the three groups of questions listed a few paragraphs above, in most cases it will be obvious that such information is necessary for intelligent judgment. In a few cases, particularly where the questions are pointed toward the future, this may not be so obvious, but it is just as true. Intelligent living now, the determination of what is best done today, demands evidence of what doing these things is meaning and may reasonably be expected to mean to those who do them.

There may be differences between adherents to these two theories of education and among those accepting various positions between the two extremes, as to where emphases should be placed, differences as to who should collect information, as to why it should be collected, as to the relative importance of various bits of information, and as to the uses that should be made of it; but in any case sound evaluation is based on data of the comprehensiveness suggested here.

And so it is with many real problems which face school people. Decisions involve evaluating alternate courses of action and making a choice, and wise choices are made only when we have at hand data adequate for judging. When judging an educational endeavor, appropriate data include complete and dependable information on the milieu into which the learner is placed and the reasons for placing him there, and on the needs and purposes of, and the methods employed by, the learner, as well as information on the outcomes observable in the learner as the experience continues.


But such complete information is typically not available, and primarily, I think, because those charged with responsibility for collecting adequate data for evaluation, the educational measurers, have not bothered themselves with its collection. Too much, those of us working in the field of tests and measurements have been concerned with developing more and more refined instruments for measuring the outcomes of learning immediately observable in students, while we lack even the crudest measures of other aspects of the teaching-learning situation, aspects which are just as essential to the intelligent evaluation of an educational endeavor. Should not a part of our energies and skill be directed toward developing dependable and valid descriptions of these other essential aspects? I do not for a minute doubt that they can be developed. Many of the descriptions can probably be reduced to quantitative terms. In some cases usable instruments are already available.
It is possible, for example, to numerically describe a large part of the needs of pupils even now. Purposes, too, at least in specific situations, could be reduced to quantitative terms. And techniques, statistical and otherwise, which are known and are now in use would be readily adaptable to wider uses. Suppose, for instance, [...] judging the worth of items in a standard[...] they were indicative of retention and later growth in the [...] we want more dependable descriptions of some of the many dimensions of [...] classroom environment. Such problems we already know how to approach. No, the trouble is not a lack of "know-how" but a lack of concern, and [...] plea is that we develop such concern. Obviously it is easier to [...] unquestioningly accept a series of assumptions which [...] that if students do better on this or that [...] experience has [...], and therefore might well be repeated and duplicated, or to judge [...] worth of past experience [...] plan for the [...] on the basis of cross[...] testing of some of the outcomes evident at a particular moment in life; but it is not through such a [...] their legitimate role in educational evaluation. When, [...] only when, we tackle the job of furnishing complete and dependable [...] essential aspects of the teaching-learning [...] which is essentially ours. [...] decisions in education (laymen, school administrators, [...]) [...] literally forced to decide on the basis of [...] "hunches", and "common sense" supplemented [...] information which we now furnish them.

The ideas concerning educational evaluation and measurements expressed here have grown out of the writer's experiences over a period of years in working with the Southern Association Study, where he was constantly compelled to participate in making decisions [...] improvement of teaching-learning situations and where [...] lack of adequate data for doing the job intelligently. [...] has often taken part in "stock-taking" which [...] to contribute to intelligent planning of next steps in an
[...] these attempts have been reduced to writing and published.¹ A critical examination of these reports reveals them as efforts to make available more of the data essential for passing judgment than are commonly furnished, but they are only crude beginnings. They lack objectivity, quantitative refinement and comprehensiveness; often they are not too realistic. A part of their weakness is undoubtedly the result of limitations in the working situation, perhaps in the workers themselves; but all such reports, and the process of evaluation itself, will continue to be unsatisfactory until we in measurements have [...] brought to bear our techniques, our skill, and our intelligence on the total job of furnishing adequate data for evaluation.

¹ See for example: Frank C. Jenkins and others, "The Southern Association Study", Commission on Curricular Problems and Research, the Southern Association of Colleges and Secondary Schools, Nashville, Tenn., 1941; Verner M. Sims, "[...] for School Groups", Bureau of Educational Research, University, Alabama, 1942; Verner M. Sims, "Problems Relating to Effective Study for Workshop Participants", Southern Association Quarterly, VII:4, November, 1943, pp. 41[...].

FORECASTING ACHIEVEMENT IN ELEMENTARY ALGEBRA


Miami University, Oxford, Ohio

Editor's Note: It has been frequently pointed out that [...]

A CANVASS of the literature of educational research of the last two decades reveals a large number of [...] have been done in elementary algebra than in any other [...], the probable reason being the unusually high percentage of [...] among beginners [...] are characterized by marked variation (a) in the types of instruments employed for measuring achievement, (b) in the types of instruments employed for predicting achievement, and (c) in the predictive efficiency of the forecasting instruments. Because of the marked variance in the criteria employed in these studies both to predict achievement and to measure achievement, and especially because of the inconclusive nature of the predictive findings, further study of the problem of prognosis in beginning algebra seems justified.
THE PROBLEM

The purpose of the study reported in this article was to discover (a) the comparative value of certain standardized measures of algebraic aptitude, of arithmetical achievement, and of initial algebraic achievement in forecasting success in elementary algebra and (b) the extent to which this success can be forecasted from various combinations of these three predictive measures.

Subjects Used. The study included seventy-five pupils who were enrolled in the ninth grade in the public schools of Marion, Ohio. The pupils were taught by two teachers and were taking their first course in algebra.

Source of Data. The achievement data consisted of scores derived from Form B of the Breslich Algebra Survey Test, which was given at the end of the semester's work in algebra. The predictive data consisted of scores derived from three tests given at the beginning of the semester. These were the Iowa Algebra Aptitude Test (Revised edition),¹ the Christofferson-Rush-Guiler Analytical Survey Test in Computational Arithmetic,² and Form A of the Breslich Algebra Survey Test.

Procedures. Two procedures were employed in treating the test data. One consisted in the use of correlation techniques for discovering relationships between the predictive measures and semester achievement in algebra. The Pearson product-moment method was used in computing the correlations. The other procedure consisted in the use of a plan for discovering the extent to which the pupils' semester placement in achievement was forecasted by their placement on the predictive measures.
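The Pearson product-moment method named above can be sketched in a few lines. The score lists below are invented for illustration and are not the Marion data; the function name is likewise only a convenience of this sketch.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# Hypothetical scores for five pupils (illustration only).
aptitude = [12, 15, 9, 20, 17]
achievement = [30, 34, 22, 41, 40]
r = pearson_r(aptitude, achievement)
print(round(r, 3))
```

The same function applied to each pair of the four factors would yield the six zero order coefficients that the study reports.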


Correlation Coefficients. A summary of the zero order, partial, and multiple coefficients of correlation, which were computed from the test data, is presented in Table I. The zero order coefficients are listed in column 1. The upper three zero order coefficients show the correlations between the predictive factors and semester achievement in algebra; the lower three coefficients show the intercorrelations of the predictive factors. All of the zero order correlations were positive and were more than ten times the size of the probable error.

Reference to the top item in column 1 will show how the zero order coefficients should be read. Thus, algebra aptitude test scores (Factor 2) correlated with semester achievement in algebra (Factor 1) to the extent of an r of .775. Comparison of the upper three zero order coefficients shows that scores on the algebra aptitude test forecasted achievement better than scores on the other two predictive measures, and that scores on the initial test in algebra had slightly greater predictive potency than scores on the test in computational arithmetic. The size of the upper three coefficients indicates that each predictive measure is significantly related to algebraic achievement; moreover, the close agreement in the size of the intercorrelations indicates that the three predictive instruments are measuring something in common, probably certain mathematical skills and relationships. The efficiency of the predictive measures may be estimated from the formula, E = [...], where E represents the percentage of improvement in efficiency of prediction over chance.† Application of the formula to the upper three zero order coefficients in the order given yields E's of .55, .50, [...]


TABLE I

ZERO ORDER, PARTIAL, AND MULTIPLE COEFFICIENTS OF CORRELATION

Factors Involved in the Study

1. End of the semester scores on the Breslich Algebra Survey Test, Form B (Achievement factor)
2. Scores on the Iowa Algebra Aptitude Test (Predictive factor)
3. Scores on the Christofferson-Rush-Guiler Analytical Survey Test in Computational Arithmetic (Predictive factor)
4. Scores on the Breslich Algebra Survey Test, Form A (Predictive factor)

        Column 1                    Column 2                    Column 3
Factors      Zero Order       Factors      Partial        Factors      Multiple
Correlated   Correlations     Correlated   Correlations   Correlated   Correlations
12           .775             12.3         .52            1(23)        .837
13           .707             12.4         [.?9]          1(24)        .802
14           .731             12.34        .45            1(34)        .793
                                                          1(234)       .845
23           .740             13.2         .32
24           .625             13.4         .46
34           .644             13.24        .19
                              14.2         [?]
                              14.3         .51
                              14.23        .44

† Walter S. Monroe, Encyclopedia of Educational Research, p. 840. Macmillan.

The coefficients of partial correlation are listed in column 2 of Table I. These show the net correlation between semester achievement in algebra and the respective predictive factors after the influence of one or both of the other predictive factors has been removed. Reference to the top item in the column will show how the partial correlation coefficients should be read. Thus, with the computational arithmetic test scores (Factor 3) held constant, the algebra aptitude test scores (Factor 2) correlate with algebraic achievement (Factor 1) to the extent of an r of .52. Analysis of the partial correlation data reveals either directly or by implication a number of significant facts. First, the algebra aptitude test scores possess slightly more predictive power than the scores on either of the other two predictive instruments. Evidence for this statement is found in the fluctuations in the size of the partial correlation coefficients listed in the upper part of column 2. Second, the net correlation of each predictive factor with semester achievement is affected by the number of the other predictive factors whose influence has been ruled out. Third, it appears evident from the data that the algebra aptitude test measures arithmetical achievement along with capacity to achieve in algebra. Fourth, the data seem to imply that the prognostic value of the aptitude test used in this study might be increased materially by the inclusion of additional elements specifically designed to measure native capacity for algebraic learning.
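The first-order partial coefficients described above follow from the zero order coefficients by the standard formula. The sketch below applies it to the zero order values of Table I. It is offered as a check on how the column is read, not as the author's own computation. Because the inputs are rounded to three places, the results fall near the published figures rather than exactly on them.

```python
import math


def partial_r(r_xy, r_xz, r_yz):
    """Correlation of x and y with z held constant (first-order partial)."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Zero order coefficients as printed in Table I (1 = achievement, 2-4 = predictors).
r12, r13, r14 = 0.775, 0.707, 0.731
r23, r24, r34 = 0.740, 0.625, 0.644

r12_3 = partial_r(r12, r13, r23)  # aptitude with achievement, arithmetic held constant
r13_2 = partial_r(r13, r12, r23)  # arithmetic with achievement, aptitude held constant
r14_3 = partial_r(r14, r13, r34)  # initial algebra with achievement, arithmetic held constant
print(round(r12_3, 2), round(r13_2, 2), round(r14_3, 2))
```

These come to roughly .53, .31, and .51, against the .52, .32, and .51 given in the text and table.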
The coefficients of multiple correlation are listed in column 3 of Table I. These show the extent to which semester achievement in algebra was forecasted by various combinations of the predictive agents. Reference to the top item in column 3 will show how the multiple correlation coefficients should be read. Thus, a combination of Factors 2 and 3 predicted achievement to the extent of an R of .837. An examination of the multiple correlation data shows that the highest R was achieved through combination of all three predictive factors, and that the R achieved through combination of Factors 2 and 3 was almost as high as that achieved through combination of all three predictive factors. Further examination of these data shows that Factor 4 adds very little forecasting value not already contributed by Factors 2 and 3; in fact, the added value is so slight that the inclusion of Factor 4 in the team of predictive tests does not seem to be justified.
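A multiple coefficient for two predictors can likewise be obtained from three zero order coefficients by the standard two-predictor formula. The sketch below illustrates it with Factors 3 and 4, for which Table I reports an R of .793. The function name is a convenience of this sketch. Since the inputs are rounded table entries, results for other combinations need not reproduce the published figures exactly.

```python
import math


def multiple_r(r_y1, r_y2, r_12):
    """Multiple correlation of a criterion y with two predictors."""
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return math.sqrt(r_squared)


# Zero order coefficients as printed in Table I.
r13, r14, r34 = 0.707, 0.731, 0.644

R1_34 = multiple_r(r13, r14, r34)  # achievement on arithmetic and initial algebra scores
print(round(R1_34, 3))
```

The multiple R can never fall below the larger of the two zero order coefficients with the criterion, which is why adding a predictor that overlaps heavily with those already used raises R only slightly.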


Actual and Predicted Placement. The extent to which the pupils' semester achievement placement coincided with that forecasted by the predictive measures used singly and in combination is shown in Table II. Reference to the top item in column 1 will show how the table should be read. Thus, seventy-nine per cent of the pupils placed in the achievement half forecasted by their algebra aptitude test scores. Analysis of the placement data reveals findings which tend to confirm some of those disclosed by the correlation data. First, all of the predictive measures forecasted the achievement half in which at least three fourths of the pupils placed and the achievement third in which three fifths or more of the pupils placed, and two of the predictive measures forecasted the achievement quarter in which one half or more of the pupils placed. Second, scores on the algebra aptitude test [...]

8. Morris                   Iowa Algebra Aptitude Test
                            Lee Test of Algebraic Ability
                            Orleans Algebra Prognosis Test
9. Orleans                  Orleans Algebra Prognosis Test
10. Seagoe                  Orleans Algebra Prognosis Test
11. Torgerson and Aamodt    Lee Test of Algebraic Ability
                            Orleans Algebra Prognosis Test

ARITHMETIC TESTS USED FOR PREDICTING ALGEBRAIC ACHIEVEMENT

1. Clifton                  New Stanford Achievement Test: Arithmetic Computation
2. Dunn                     New Stanford Achievement Test: Arithmetic Computation and Arithmetic Reasoning combined
3. Kellar                   Unit Scales of Attainment: Arithmetic Computation
4. Layton                   New Stanford Achievement Test: Arithmetic Computation and Arithmetic Reasoning combined
5. McCuen                   New Stanford Achievement Test: Arithmetic Computation
6. Morris                   Orleans Achievement Test: Arithmetic Computation
7. Seagoe                   New Stanford Achievement Test: Arithmetic Computation and Arithmetic Reasoning combined

TESTS FOR MEASURING ALGEBRAIC ACHIEVEMENT

1. Dichter                  Breslich Algebra Survey Test
2. Dunn                     Douglass Survey Test for Elementary Algebra
3. Grover                   Columbia Research Bureau Algebra Test
4. Layton                   Cooperative Algebra Test
5. Lee and Hughes           Columbia Research Bureau Algebra Test
6. McCuen                   Douglass Diagnostic Algebra Tests
7. Orleans                  Columbia Research Bureau Algebra Test


In the writer's study, with scores on the Breslich Algebra Survey Test as the achievement criterion, scores on the Iowa Algebra Aptitude Test yielded an r of .775. Using the same types of variables, Dichter (3)* reported an r of .65; Dunn (4), an r of .33; Grover (5), an r of .61; Layton (8), an r of .66; Lee and Hughes (9), an r of .62; and Orleans (12), a mean r of .65 computed from r's obtained in a number of schools.

In the writer's study, with scores on the Breslich Algebra Survey Test as the achievement criterion, scores on the Christofferson-Rush-Guiler Analytical Survey Test in Computational Arithmetic, which were used as predictive measures, yielded an r of .707. Using standardized algebra test scores as the criterion of achievement, Dunn (4) reported an r of .36 with scores on arithmetic computation and arithmetic reasoning tests combined; Layton (8), an r of .63 with scores on arithmetic computation and arithmetic reasoning tests combined; and McCuen (10), an r of .36 with arithmetic computation scores.

* Numbers in parentheses refer to studies listed in the selected bibliography.


Some of the same types of predictive measures employed in this study were used in other studies with teachers' marks in algebra as the criterion of achievement. The studies in which algebra aptitude test scores were correlated with teachers' marks reported results as follows: Ayers (1), an r of [...]; Kertes (7), an r of .61; Layton (8), an r of .64; Lee and Hughes (9), an r of [...]; Morris (11), a mean r of .64 computed from r's obtained by correlating scores on three algebra aptitude tests with algebra marks; Orleans (12), a mean r of .64 computed from r's obtained in a number of schools; Seagoe (13), an r of .40; and Torgerson and Aamodt (14), a mean r of .61 computed from r's obtained by correlating scores on two algebra aptitude tests with algebra marks.


The studies in which teachers' marks in algebra were used as the criterion of achievement and scores on arithmetic achievement tests as predictive measures reported results as follows: Clifton (2), an r of .45 with computational arithmetic test scores; Kellar (6), a mean r of .62 computed from r's obtained by correlating computation arithmetic test scores with teachers' marks based on algebra computation ability and by correlating computation arithmetic test scores with algebra marks based on algebraic problem-solving ability; Layton (8), an r of .67 with scores on arithmetic computation and arithmetic reasoning tests combined; Morris (11), an r of .60 with computation arithmetic test scores; and Seagoe (13), an r of .58 with scores on computation and arithmetic reasoning tests combined.


Coefficients of multiple correlation were reported in ten of the related studies, thus making a further comparison of findings possible. In the writer's study, an R of .837 was obtained between algebra survey test scores and a combination of algebra aptitude test scores and computational arithmetic test scores, and an R of .845 was obtained by including initial algebra test scores in the combination. Only six of the ten studies which reported coefficients of multiple correlation obtained R's above .70. Dichter (3) reported an R of .73 between algebra survey test scores and a combination of algebra aptitude test scores and eighth-grade mathematics marks, and an R of .74 when IQ's were included in the combination. Kellar (6) reported an R of .78 between teachers' marks in algebraic problem solving and a combination

TRENDS IN THE USE OF STATISTICAL TOOLS IN
EDUCATIONAL RESEARCH ARTICLES 


MARIAN A. KITTLE 
Indiana State Teachers Colleg: 
Terre Haute, Indiana 


Editor's Note: One measure of the maturity of an area of scientific
research will be found in its amenability to quantitative methods. The author
supplies a picture of current trends.


DURING the last two decades, a knowledge of statistical tools¹ has
become more and more necessary for a comprehensive reading of publications
and for working in the field of educational research. Unless one has
some knowledge of these tools, he finds himself hopelessly lost before getting
a good start in reading research publications. Neither can one make
interpretations of his own studies without some understanding along this
line. The average person, not having the time nor the inclinations to delve
deeply into this branch, needs to know on which tools to concentrate.
The problem then is to select those statistical tools which will be
valuable to the student of education. In doing so it probably will be best to
analyze articles written as a result of actual studies made in the field of
educational research rather than articles or textbooks written about research.
If we are going to measure the frequency of use of statistical tools during
a period of time, we must restrict ourselves in some defensible manner
to a fixed breadth of reporting of researches throughout the period
and also to some relatively uniform standard of quality in selecting the
researches reported. With this in mind, the Journal of Educational Research
was chosen since, as its name indicates, it specializes in reporting researches
in education. Also, it has contained approximately the same number of pages
in each issue throughout the years, and it presumably has maintained a
relatively high standard of quality in selecting researches for publication.
Furthermore, the Journal of Educational Research started publication at such
a time that it has existed almost throughout the period of research in
education when statistical methods were used.*

¹ The term “tools” is used to include statistical terms, graphs, and formulas.
* The first issue was published in January, 1920.

All of the data for this study were taken from the Journal of Educational
Research for the years 1920-1940, inclusive. Each article published
during this twenty-one-year period was carefully scanned, and the tools were
listed as they were mentioned. In order that undue weight might not be 
attached to any one particular term, its frequency was based on the number 
of different studies in which it was used and not on the number of times 
it was mentioned or computed in each of the various studies. The writer 
does not intend to use a frequency count for determining the importance of 
a tool. She merely wishes to show which ones are encountered most fre- 
quently in publications. Neither does the writer intend to fix a tool's origin 
as the date it first appeared in the Journal. 
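
The counting rule just described, in which a tool is credited once per study no matter how often that study mentions it, can be sketched roughly as follows. This is only an illustration and not the writer's actual tabulating procedure; the function name, the (study, tool) pair format, and the sample records are all hypothetical.

```python
from collections import Counter


def tool_frequencies(mentions):
    """Count each tool once per study, however often the study mentions it.

    `mentions` is a list of (study_id, tool_name) pairs. The pair format and
    the sample data below are hypothetical, chosen only for illustration.
    """
    distinct_pairs = set(mentions)  # repeated mentions within a study collapse
    return Counter(tool for _study, tool in distinct_pairs)


# Hypothetical records: article A mentions the mean twice.
sample = [
    ("article A", "mean"),
    ("article A", "mean"),
    ("article A", "median"),
    ("article B", "mean"),
    ("article B", "standard deviation"),
]
frequencies = tool_frequencies(sample)
print(frequencies["mean"])  # the mean is credited to two different studies
```

Deduplicating on the (study, tool) pair before counting is what keeps a single heavily statistical article from inflating the frequency of any one term.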

Complete articles including bibliographies and notes were used in
enumerating the pages. However, nothing was done with those sections at
the end of each issue of the Journal entitled “Editorials,” “Reviews,”
“Research News and Communications,” et cetera. Continued articles which were
too long for one issue were treated as single articles. But if studies were
divided into related parts (Part I, Part II, et cetera) and published in
different issues for two or more months, they were treated as separate articles.

Articles dealing with the discussion of procedures, formulas, or short 
cuts were listed under the heading “Discussion Articles,” and the procedures
contained in such were ignored because this study was interested only in 
actual applications and not possibilities or merits. 

To simplify the tables and discussions used in reporting the results of 
this study, and so that trends might be better presented, the twenty-one-year 
period was divided into four shorter periods. It was arbitrarily agreed that 
the first period would contain data covering the six years from 1920 through 
1925, and the remaining periods would each contain data covering five years.
In most cases, the data for the various periods could be reported as found, 
but in a few special instances, it was thought best to take five-sixths of the 
first period so that more exact comparisons might be made. 
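
The five-sixths adjustment can be illustrated with the one pair of figures that Table II preserves for the first period, an unweighted total of 968 tools and a weighted total of 807. The sketch below is a minimal illustration under the assumption that the weighted figure is simply five-sixths of the unweighted one, rounded to the nearest whole number; the function name is hypothetical.

```python
def scale_to_five_years(count, years_in_period=6, target_years=5):
    """Scale a count gathered over a six-year period to a five-year basis."""
    return count * target_years / years_in_period


unweighted_total = 968  # total tools, 1920-1925, as listed in Table II
weighted_total = round(scale_to_five_years(unweighted_total))
print(weighted_total)  # compare with the weighted entry in Table II
```

Under that assumption the result agrees with the weighted entry of 807, which suggests the weighted rows in the tables were obtained in this way.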

Merely tabulating the frequencies was not sufficient to show the true 
significance of the number of tools used. Therefore, it was necessary to make 
other notations and report other findings. Along with the tools, their
frequencies, and interpretations concerning their frequencies, there are inserted
tables and discussions dealing with those other phases of the study, such as
the types of articles reported, percentage of articles and pages included in
each type, graphic presentations, and various ranges, averages, and
frequencies.

In order that this study might not become too complicated a task, certain
restrictions were deemed necessary. Although tables have their place
the same as do graphs and diagrams, their frequency was not tabulated, since
a large percentage of the tabular data were written in exposition or grouped
in some other form to facilitate publication. Such cases contained tabular
data even though they were not set up in tabular form or numbered and
titled.

Tools such as I.Q., A.Q., and E.Q. which measure intelligence, mental
accomplishment, or educational ranking, or indexes, norms, and ratings
applying to standardized tests were purposely omitted. This study was interested
only in tools common to all branches of educational research.

Per cents were not tabulated because they were used at some time or
another in nearly every study. Then, too, oftentimes it was the only tool
used, and its extremely large frequency of use would hardly have been of
any value. No attention was paid to gains and deviations since they were
merely a matter of subtraction and therefore not difficult to interpret.

All “errors” were taken for granted to be probable errors unless at
some place in the article they were specifically stated to be standard errors.

Various index numbers were also omitted, since most of them, unless
applying to standardized tests and scales, were each peculiar to the particular
studies in which they were used and not general tools that could be used
in any research.

Nothing was done with tools which were used in the computing of
other measurements, such as the standard error of the difference between
two means in finding the critical ratio of the means, or the sigma in finding
the probable error, unless the authors deemed them important enough to
list them and their values or significances. The same is true with the range
and rank. Tools were tabulated only when their values were listed in
tables or when they were specifically mentioned in the exposition.

Articles classed as non-statistical might or might not contain some of
those tools mentioned as having been omitted.


Data directly concerned with the three types of articles are shown in
Table I. The articles as they have been classified for this study are those
(1) containing statistical tools, (2) containing no statistical tools, and (3)
discussions of procedures. Each five-year period has shown a decrease over
the preceding period in the number of pages pertaining to articles published
in the Journal, with the last five years showing a decrease of 28 per cent as
compared with the first period. The number of different articles, however,
did not show such a regular decline. For instance, the second period showed
the greatest per cent of decrease in pages, but it reversed matters according
to the number of articles printed by showing an increase of 3.2 per cent.
For the twenty-one years, the total decrease in the number of articles printed
was 21.1 per cent. From these figures, it seems that in general there was a
tendency to cut down on the exposition but not so much on the number
of studies.

The number of articles containing statistical tools showed neither a
constant increase nor decrease during the twenty-one years, and ended with
the last period printing 0.6 per cent fewer statistical articles than did the first
period. Nevertheless, the number of pages devoted to this type increased 
0.5 per cent. 

The percentage of non-statistical articles dropped lowest during the 
third five years, but rose to a new high during the last five years. The
increase for the last period over the first period was 6 per cent. However,
the percentage of pages devoted to non-statistical articles increased slightly
but constantly until the last period showed an increase of 3.8 per cent over
the first period. 

As stated before, discussion articles are those which are devoted wholly
to the possibilities or merits of some particular procedure, formula, or short
cut. As the older and more familiar tools became better established, fewer
articles were written discussing them. Both the percentage of articles and
pages concerning discussions decreased steadily throughout the twenty-one
years until that for articles dropped from 6.2 per cent to .08 per cent, and 
that for pages dropped from 4.9 per cent to .06 per cent. 

Table II is included to give the reader some view of the popularity 
of the tools in general. One outstanding fact revealed by this table is that
although the total number of tools dropped considerably during the last
period, the number of different tools used was higher in proportion than
during any other period. Also, the average number of different tools per
article was highest for that period. The only steady change shown by Table
II is that for the average number of different tools per article, which
increased until the last five years showed authors using 36.4 per cent more
tools per research than they did during the first period.

TABLE II

USE OF STATISTICAL TOOLS IN ARTICLES* APPEARING IN THE JOURNAL OF
EDUCATIONAL RESEARCH OVER PERIOD, 1920-1940

               Total Number   Number of      Range of Different   Average Number
Period         of Tools       Different      Tools in a           of Tools
               Used           Tools Used     Single Article       per Article
1920-1925
  Unweighted   968            68             1-16                 [...]
  Weighted     807
1926-1930      779            [...]          1-14                 [...]9
1931-1935      [...]          [...]          1-12                 4.2
1936-1940      69[...]        [...]          1-14                 4.[...]
Total          3,249          160            1-16                 1.96

* Articles which were classed as statistical.


Thus far, the data presented has been more or less of a general nature
to give some sort of a background for the material which follows. Beginning
with Table III, the data presented will be that from which the conclusions
were drawn.


Central Tendency. Table III shows the measures of central tendency.
The frequencies for the average and median decreased steadily throughout
the twenty-one years while the frequencies of the mean for the last three
five-year periods showed increases over the first period.


TABLE III 


FREQUENCIES OF TOOLS MEASURING CENTRAL TENDENCY 


Tool 1920-1925 1926-1930 1931-1935 1936-1940
\ 118 "44 
Me 122 2 
6 
Total central tendency 244 200 179
Number of different tools used


Variability and Measures of Dispersion. The old standbys, deviations
and ranges of centralization, according to Table IV, seem to be slipping.
Only in the case of standard deviation has there been any increase, and its
frequency dropped considerably during the last five-year period. The
standard deviation also ranks far above the other measures of dispersion in
popularity. Next in popularity are the quartiles, one and three,* which give
the range of the middle 50 per cent of the cases. The most constant
measurement in this group, although its frequency is relatively small, is the
quartile deviation. On the whole, however, the frequency of use of the
measures of variability or dispersion decreased steadily during the
twenty-one years covered by this study.


TABLE IV


FREQUENCIES OF TOOLS MEASURING VARIABILITY OR DISPERSION


Tool 1920-1925 1926-1930 1931-1935 1936-1940
Range 153 42 16 33 
ti 2 4 age dé 110 2 3 2 
Mea le 4 1 1 
in de 4 2 l 
Quartile devia 6 
Ir yuar 2 
Qu les (Q) a ( 20 1 13 14 
Quartiles spe 7 
Coett f varia Pearson 2 2 1 
t of variation (general)* > l 
2 Total number 33 134 132 il 
7 7 Number of different tools used 11 11 8 10 
* The term “general” is used here and in following tables to take care of those
cases where the specific formula used was not mentioned in the text.

Reliability. There have been considerable fluctuations in the popularity
of the various reliability tools as shown in Table V. No one tool has shown a
constant increase or decrease for the combined four periods. The probable
error of the coefficient of correlation seems to be the most widely used tool
in this group. Its frequency increased during the first sixteen years, but like
the standard deviation (Table IV), it decreased somewhat during the last
five-year period.

Although it was used for the first time during the second five-year
period, the critical ratio, whether found by using the standard error or the
probable error, has shown a constant increase in frequency of use. The
standard error has not been used as often as the probable error, probably because
it is a newer tool. But even so, like the critical ratio, it has shown almost
a constant increase throughout the whole of this study.

* Although quartiles 1 and 3 are measures of position, when taken together, they
afford an index of spread.

TABLE V


FREQUENCIES OF TOOLS MEASURING RELIABILITY 


Tool 1920-1925 1926-1930 1931-1935 1936-1940


P. E. of average 2 
P. mean 11 18 
f diar 4 
E. of measurement 1 l 
P. { coeff. of correlation 22 7 
{sigma 1 4 
P. BE. of es ate l l l 
P. E. of coeff. of reliabilit 2 4 12 t 
score 2 2 
{ skewness 2 
P.E fa difference 2 17 i 14 
39 83 109 73 
fav ge 2 
S. FE. of mear l 10 
EF. of me n l 1 
S. E. of coeff. of correlation l 4 
5. E.ofs 1 l 
S. 2 l 2 2 
S.I fy l 
S. E. of n l 
S_ FE fa difference 2 12 26 20 
Total S. 2 12 
Rat { diff. to P. E. of diff. between averages $ l 
Ra liff. to P. FE. of diff. between means 1 ) 
Rat lif. to P. of diff. between medians. | 
Ra fdiff.to P. EB. of diff. between per cents 2 
cal ratios (P. E | 2 . 10 
Ra T S. E. of diff. between averages l 2 l 
Ra ff. toS. E. of diff. between means 9 14 l 
Ra liff. toS. E. of diff. between sigmas l 2 
Rat toS. E. of diff. between per cents 1 } 
| 
otal critical ratios (S. E 11 17 21 
Cr al ratio (general l 
Rat {P. E. of mean to mean | l 
lotal reliability 47 124 18% l4 
Number of different tools used | 14 24 25 


In spite of the fact that there is no regular increase or decrease in any
particular tool with the exception of the critical ratios, the total use of
measures of reliability, unlike variability or dispersion, has increased
considerably since 1920.

Correlation. No outstanding changes have taken place in the use of
the various correlations. The frequency of the Pearson Product-Moment
method has decreased in recent years, but we can draw no decisive
conclusions from this fact because there is no corresponding increase in any
other one method, and also, we have no way of knowing what per cent of
those correlations (general) whose methods were not given were computed
by means of the Pearson Product-Moment method. Another change which
merits mentioning is the increased use of the coefficient of reliability. The
total for all correlations increased only slightly during the twenty-one years.


TABLE VI 


FREQUENCIES OF TOOLS MEASURING CORRELATION 


Tool 1920-1925 1926-1930 1931-1935 1936-1940
Coeff. of correlation (general) 7 47 63 49
Pearson Product-Moment 27 29 16 13
Spearman Rank-Difference 7 5 4
Spearman Foot-Rule [?] 4 8
Intercorrelation 10 8 11 10
Partial correlation 7 [?] 4 6
Multiple correlation [?] 4 2 7
Zero order correlation 1 [?] 1 4
Biserial correlation 1 4
Self correlation 2 1
Coeff. of reliability (general) 4 11 18
Coeff. of reliability (Spearman-Brown) 4 16
Coeff. of reliability (Pearson) 2 1
Coeff. of validity 1 3 [?]
Coeff. of alienation 1 3 1 1
Coeff. of attenuation 2 2
Coeff. of prediction 1 1
1
Eta (correlation ratio) 1 2 1
Regression equations, formulas, and coefficients 4 6

Total correlation 131 134 163 152 
Number of different tools used 12 17 17 19 


Miscellaneous Tools. The miscellaneous tools are those which have
been used more than once during the twenty-one years but which do not fit
into any of the beforementioned classifications. The total frequencies for the
group as a whole decreased steadily and considerably throughout the
twenty-one years.

Tools Used Only Once. The tools which were used only once were
divided into two distinct groups. One group includes tools which are
self-explanatory such as the Otis Difference formula, ratio of gain to sigma,
and Yule's coefficient of association, but which have not been used often
enough to justify including them in their respective sections and thereby
complicating tables needlessly, since no tendencies could be revealed. The
other group includes tools which, unless explained by the authors using
them, mean nothing to the average reader. In fact, many of them mean little
to the more experienced readers. A few examples of this second group are
coefficient of determination, Lexis ratio, Dp', and tetrachoric correlation.
This group includes many more tools than does the first group.
TABLE VII 
FREQUENCIES OF MISCELLANEOUS TOOLS 
Tool 1920-1925 1926-1930 1931-1935 1936-1940

Rank and ranking 37 22 15 15
Weights and weighting 7 2 1 9
Percentiles and percentile rank 10 10 16 [?]
Deciles and decile rank [?] 1 2
Quintiles and quintile rank 1 1
Centiles 1
Order rank 1 1 1
T-score and T-scale 7 1
Skewness 1
Standard scores 2 2 1
Probability 2 2 [?] 8
Experimental coefficient 1 4
Transmuted scores 1 2 1
Curve types 2 1
Blakeman's Test (Zeta) 1 1

Total Miscellaneous 67 71 53 41

Number of different tools used [?] 13 13 10


The total frequency of these two groups combined remained constant 
the first eleven years, more than doubled during the next five years, and, 
although it decreased slightly during the last five years, it still remained 
nearly twice as large as it was during either of the first two periods. The 
four frequencies were: first period, eleven; second period, eleven; third 
period, twenty-six; and fourth period, twenty. 


Graphic Presentation. Table VIII is approximately a continuation of 
Table II. It is significant in that it depicts the steady decrease in both the 
number of graphs used and the number of articles using graphic presentations.
More and more writers are relying on exposition and semi-tabular
forms for the presentation of data. This can probably be accounted for by 
the increased cost of publication when graphs are printed. Also, there are 
many graphs made which are unnecessary, since they are taken from tables 
clear in themselves.

TABLE VIII


USE OF GRAPHIC PRESENTATIONS IN ARTICLES APPEARING IN THE JOURNAL
OF EDUCATIONAL RESEARCH OVER PERIOD, 1920-1940


               Number         Total Number   Number                       Average
Period         of Articles    of Graphs      of Different   Range         Number
               Using Graphs   Used           Kinds Used     per Article   per Article
1920-1925
  Unweighted   70             106            [...]          1-5           1[...]
  Weighted
1926-1930      [...]9         61             [...]          1-5           1.2
1931-1935      37             43             [...]          1-2           1.2
1936-1940      [...]          [...]          7              1-6           1.4
Total          186            253            11             1-6           1.4


The only noticeable change was during the third five-year period. The
change was in the range of different graphs used in any one article. During
the whole period, two different types were the most ever used in a single
article, while the maximum was five or more for the other three periods.
However, the average number of different types per article used during the
four periods remained nearly constant. Seven was the smallest number of
different graphs used during any one period, and eight was the most.
Throughout the twenty-one years, only eleven different kinds were used.


TABLE IX 


FREQUENCIES OF THE VARIOUS TYPES OF GRAPHIC PRESENTATIONS
USED DURING THE PERIOD, 1920-1940


Type of Graphical Presentation 1920-1925 1926-1930 1931-1935 1936-1940

Frequency polygon or modification of 17 10 8 1
Histogram or modification of 9 3 4
Line graph or modification of 32 19 15 18
Historigram or modification of 5 3 1
Percentile chart or modification of 2 2 1
Bar graph or modification of 30 20 10 17
Scattergram or modification of 10 4 4 4
Component-band chart 1
Excess-deficit chart 1
Modified-multiple dot 1
Organization chart 1

Total number used 106 61 43 43


CONCLUSIONS


When this study was planned, it was suspected that trends would be
fairly well pronounced; that some tools which were popular fifteen or
twenty years ago had fallen into disuse and that some tools are being used
now which were unknown at the time the Journal started publication. On
the contrary, the data proved just the opposite. No new tools of any
noticeable popularity have appeared in the Journal since the period, 1926-1930.
Neither have any of the frequently used tools of twenty years ago decreased
in popularity to any noticeable degree. On the whole, frequencies have
remained fairly constant.

While the five-year total for articles containing statistical tools
fluctuated, the average number of tools per article increased steadily until authors
were using more than twice as many during the last period as they were
during the first period.

Writers are breaking away from the old idea that everything must be 
shown in both tabular and graphic forms. If data are few and simple,
exposition is used solely.

In the case of central tendency, the mean has increased in frequency as 
the median decreased. This tends to indicate that researchers are using the
median only in limited cases of central tendency such as salary or wage
studies. They are using the tools which will tell them the things they want
most to know. 

The writer feels, after an examination of the articles during these 
twenty years, that a larger degree of standardization of terms and symbols 
would make for ease of reading and understanding research reports. To give 
only a few examples, some things like the following were found.

The small letter k was used as a symbol by various people to stand for
probability, Chauvenet's Criterion, and correlation. In a standardized
vocabulary, the reader, when seeing “k,” would know instantly the measure used
whether or not detailed explanations were given.

Probability was indicated by “k,” “t,” and “P.” This is confusing to the
beginning statistician or researcher who heretofore, perhaps, has known
probability only as “t.”

The standard error of a difference was found abbreviated in three different
ways: S. E. diff., S. D. [...], and [...]. Again the novice is confused if
he has been taught to recognize only one of these three abbreviations.

The average reader knows when an author speaks of the critical ratio
of a difference that he is speaking of the ratio of a difference to the probable
error or standard error of that difference. But does the average reader know
that the author means the same thing when he speaks of the index of
significance or the reliability of the difference?
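
The critical ratio described here is simple arithmetic, and a short sketch may make the two variants concrete. The example assumes the usual normal-curve relation in which the probable error equals 0.6745 times the standard error; the function names and the sample figures are hypothetical and are not taken from any study in the Journal.

```python
PE_PER_SE = 0.6745  # probable error as a fraction of the standard error


def probable_error(standard_error):
    """Convert a standard error to a probable error (normal-curve assumption)."""
    return PE_PER_SE * standard_error


def critical_ratio(difference, error_of_difference):
    """Ratio of a difference to the error (S.E. or P.E.) of that difference."""
    return difference / error_of_difference


# Hypothetical figures: a difference of 4.0 points between two means,
# with a standard error of the difference of 1.6 points.
difference = 4.0
se_diff = 1.6

cr_se = critical_ratio(difference, se_diff)
cr_pe = critical_ratio(difference, probable_error(se_diff))
print(round(cr_se, 2), round(cr_pe, 2))
```

Because the probable error is smaller than the standard error, the same difference always yields a larger critical ratio when the probable error is used, so a reader needs to know which of the two an author divided by before comparing ratios across studies.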

Although these examples do not exhaust the list of confusing definitions
and abbreviations found in this study, they are enough to indicate the
importance of a standardized vocabulary.

If the frequencies of the different tools, as found by this study, were
used as a basis for planning a course of study, such as a course in thesis
writing, it seems a working knowledge of the following tools and their
significances would give the novice an adequate background for his own
researches and for comprehensive readings of other studies.


1. Central Tendency:
Average, arithmetic mean, and median.

2. Measures of Dispersion:
Standard deviation, average or mean deviation, median deviation,
quartile deviation, and coefficient of variation.

3. Probability:
Probable error and standard error of the most frequently used measures
of central tendency, dispersion, and correlation, probable error and
standard error of a difference, and critical ratios.

4. Correlations:
Pearson Product-Moment method, Spearman Rank-Difference method,
intercorrelation, partial correlation, multiple correlation, coefficient of
reliability, and regression.

5. Miscellaneous:
Range, rank and ranking, weights and weighting, percentiles and
percentile rank, T-score and T-scale, and skewness.

6. Graphic Presentation:
Frequency polygon, histogram, scattergram, line graphs, and bar graphs.


A STUDY OF THE LEARNING AND RETENTION
OF MATERIALS PRESENTED BY LECTURE
AND BY SILENT FILM
CLARENCE D. JAYNE


Central State Teachers College
Stevens Point, Wisconsin


Editor's Note: Science has brought to education many valuable new
instruments of learning and teaching of which the silent motion picture is
one. The author presents data on the value of this instrument.


THE purpose of this study is to investigate (1) the informational gains
made by pupils listening to a lecture, as compared with gains made from
seeing a silent motion picture presenting the same material, and (2) the
retention of the informational gains made by the two methods of presentation
over varying periods of time from three weeks to fifteen weeks.

With the renewed interest in visual education in recent years there 
has been considerable emphasis placed upon the importance of the visual 
approach to teaching. Many studies give consistent evidence that pupils 
make greater gains when visual materials are used. In considering teaching
methods it is important to know whether these greater gains are due primarily
to the visual presentation alone, or to the visual presentation plus
other activities such as teacher preparation and follow-up, pupil discussion,
etc. If the visual presentation alone is responsible for the effectiveness of
teaching procedures based on the use of pictures, then teachers would be
justified in the mere showing of pictures. If, on the other hand, it is found
that the mere showing of pictures does not seem to be an effective teaching
procedure, then teachers need to be much concerned with the teaching
devices used to supplement the seeing experience.

It must be kept in mind that this study is not a comparison of the 
visual method and the lecture method. As the study is described it will be 
apparent that it is rather a comparison of one way of using silent motion 
pictures (the mere showing of the pictures with no supplementary
procedures) with a lecture method which included the use of such visual
materials as blackboard diagrams, etc.


EXPERIMENTAL PROCEDURE 


The pupil population used in this study was made up of 271 pupils in
the ten freshman general science classes of the P. J. Jacobs High School in
Stevens Point, Wisconsin. The number of pupils in each class ranged from
23 to 30, as is shown in Table I. The classes were not organized on the
basis of pupil ability, but each represents, so far as could be determined,
a random sampling of the total freshman population so far as any factor is
concerned which might have influenced the results of this study.


Two teaching units were used in this study: the first an elementary
presentation of relativity, and the second a study of petroleum production.
These units were selected for the following reasons:


1. The subject matter was of a type which is commonly used in general
science classes.

2. A motion picture dealing with each of these units was available
with a running time of approximately 30 minutes.

3. None of the ten classes used in the study had had any previous class
work, so far as could be determined, in either of these units.

4. None of the ten classes used in the study would normally have had
any class work closely related to either unit during the 15 week period
covered by the study.


The films used in the study were “Relativity” and “Petroleum.” Both
films were in satisfactory condition and were thought to be quite typical of
the semi-technical general science silent films, combining ordinary
photography with animated drawings and explanatory titles.


The lectures used in the experiment were carefully prepared by one
of the general science teachers on the basis of notes taken from repeated
viewings of the films. Every effort was made in the lecture to present all
the essential ideas presented in the film. The lecture was illustrated by
blackboard sketches and diagrams similar to some of the animated drawings used
in the films. The lecture was worked out so as to take the same time as was
needed to show the film. The same teacher gave the lecture to all classes
used in the study and, as nearly as was possible, presented the material in
the same way to each group.

Two objective tests, one covering the factual material presented in the
film “Petroleum” and the other covering like material presented in the film
“Relativity,” were used as the measuring instruments in this investigation. The
test on “Petroleum” had 69 items, part being true-false, part multiple
choice, and part completion. This test was found to have a reliability of .76
as determined by the chance half method and the application of the
Spearman-Brown prophecy formula. The test on “Relativity” was made up
of 65 items of the same types used in the test on “Petroleum.” This test
was found to have a reliability of .68.
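
For readers unfamiliar with the procedure, the step from a half-test correlation to a full-test reliability by the Spearman-Brown prophecy formula can be sketched as below. The half-test correlations themselves are not reported in the study, so the sketch only works backward from the reported reliabilities of .76 and .68 to the half-test values they would imply; the function names are hypothetical.

```python
def spearman_brown(r_half, factor=2.0):
    """Predicted reliability of a test lengthened `factor` times.

    With factor=2 this is the familiar split-half correction:
    r_full = 2 * r_half / (1 + r_half).
    """
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)


def implied_half_correlation(r_full):
    """Invert the split-half case: the half-test r that steps up to r_full."""
    return r_full / (2.0 - r_full)


# Reported full-test reliabilities; the implied half-test values are
# derived here for illustration and do not appear in the source.
for reported in (0.76, 0.68):
    r_half = implied_half_correlation(reported)
    print(reported, round(r_half, 3), round(spearman_brown(r_half), 2))
```

The inverse function is included only to show what correlation between the two chance halves would be consistent with each reported figure.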


The general experimental pattern is presented in Table I.

TABLE I
EXPERIMENTAL PROCEDURE AND INFORMATION REGARDING CLASSES
USED IN THE STUDY

                Number   Number                             Retention
Group   Class   in       in       Lecture      Film         Interval
                Class    Group    Unit         Unit         (weeks)

I       1       30                Petroleum    Relativity    3
        2       25       55       Relativity   Petroleum     3
II      3       30                Petroleum    Relativity    6
        4       23       53       Relativity   Petroleum     6
III     5       28                Petroleum    Relativity    9
        6       27       55       Relativity   Petroleum     9
IV      7       30                Petroleum    Relativity   12
        8       27       57       Relativity   Petroleum    12
V       9       25                Petroleum    Relativity   15
        10      26       51       Relativity   Petroleum    15


As will be seen from Table I, the ten classes used in the study were
divided into five groups, each made up of two classes. Each group may be
considered as an experimental unit, as the same procedure was used with
each, with the exception that the time interval between the teaching of the
lessons and the delayed recall test varied, as shown in Table I, for each
group.
The “rotated group method”* was used with each of the five groups.
Thus while one of the classes in Group I had a lecture on petroleum and
the film on relativity, the other had the film on petroleum and the lecture
on relativity. It will be observed that each class had a film lesson and a
lecture lesson. This procedure probably “rotated out” such factors as
differing abilities of the two classes, the difference in the type of material
in the two units, the difference in the order in which the procedures were
used, etc.
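Why rotation removes class and unit differences can be seen in a small additive sketch. Everything numeric below is invented for illustration, and the additive model itself (gain = method effect + unit effect + class effect) is an assumption, not something the study states.

```python
# Hypothetical additive model: gain = method effect + unit effect + class effect.
METHOD = {"lecture": 3.0, "film": 2.0}         # invented values
UNIT = {"Petroleum": 1.5, "Relativity": -0.5}  # invented values
CLASS = {1: 0.8, 2: -0.3}                      # invented values

# Rotation used with Group I: class 1 has the lecture on Petroleum and the
# film on Relativity; class 2 has the reverse.
PLAN = {
    1: {"lecture": "Petroleum", "film": "Relativity"},
    2: {"lecture": "Relativity", "film": "Petroleum"},
}


def gain(cls: int, method: str) -> float:
    """Gain of one class under one method, according to the additive model."""
    return METHOD[method] + UNIT[PLAN[cls][method]] + CLASS[cls]


def group_mean(method: str) -> float:
    """Mean gain for the method over both classes in the group."""
    return sum(gain(c, method) for c in PLAN) / len(PLAN)


difference = group_mean("lecture") - group_mean("film")
print(round(difference, 6))  # equals METHOD["lecture"] - METHOD["film"]
```

Because each class and each unit contributes once to the lecture mean and once to the film mean, those terms cancel in the group difference, leaving only the difference between methods.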
On Friday, February 20, the test on Petroleum was given as a pretest
to all ten of the participating classes. On the following Monday, February
23, the lecture on Petroleum was given to classes 1, 3, 5, 7, and 9. The same
day the film on Petroleum was shown to classes 2, 4, 6, 8, and 10. Tuesday,
February 24, the test on Petroleum was again given to each of the ten
classes. The difference between the pretest score and the second application
of the test was considered as the gain for the teaching procedure used.
On Wednesday, February 25, the test on Relativity was given as a
pretest to all of the classes. On February 26 the lecture on Relativity was
given to classes 2, 4, 6, 8, and 10. The same day the film on Relativity was
given to classes 1, 3, 5, 7, and 9. On February 27 the test on Relativity was
given again to each of the ten classes so that the gain from each of the
teaching procedures could be calculated.

*W. A. McCall, How to Experiment in Education. New York: Macmillan
Company, 1923. Pp. 32-33.

In order to determine the retention of the material learned over
varying lengths of time, the tests on Relativity and Petroleum were given a
third time to each group according to the following schedule:

Group I consisting of classes 1 and 2, 3 weeks after teaching.
Group II consisting of classes 3 and 4, 6 weeks after teaching.
Group III consisting of classes 5 and 6, 9 weeks after teaching.
Group IV consisting of classes 7 and 8, 12 weeks after teaching.
Group V consisting of classes 9 and 10, 15 weeks after teaching.


EQUATING OF CLASSES


The rotated group method of experimentation rotated out certain
differences between the two classes in each of the experimental groups, but
of course it did not in any way neutralize differences between the five
groups. Since the portion of the study dealing with retention required a
comparison of the retained gains made by each of the five groups, it was
obvious that these groups should be of equal learning ability. The equating
was done on the basis of the gains made from the film or lecture
presentation.


TABLE II
EQUATING OF FIVE CLASSES THAT HAD FILM ON PETROLEUM
AND LECTURE ON RELATIVITY

        No.      M Film       M Lecture           σ Film     σ Lecture
Class   Pupils   Gain         Gain                Gain       Gain

  2     17       15.4 ± .9    15.6 ± .5           5.6 ± .7   3.3 ± .4
  4     14       15.3 ± 1.3   [illegible] ± .7    7.2 ± .9   3.7 ± .5
  6     20       15.3 ± 1.0   15.6 ± .7           6.7 ± .7   4.4 ± .5
  8     13       14.6 ± 1.2   16.1 ± .9           6.4 ± .9   4.7 ± .6
 10     15       14.7 ± 1.1   16.3 ± 1.2          4.5 ± .5   4.7 ± .5
For purposes of equating the classes it was necessary to divide them
into two groups: the five classes that had the film on Petroleum and the
lecture on Relativity, and the five that had the film on Relativity and the
lecture on Petroleum. Individuals were eliminated from each class until the
mean gains and sigmas were equated to within about a standard deviation
(Tables II and III), although in a few cases the differences between sigmas
were slightly greater. Closer equating would have demanded reducing the
number of cases more than seemed justified. As will be seen in Tables II
and III the process of equating reduced the size of the classes so that they
ranged in size from 12 to 20 students. Two series of calculations were made
from the data, the first based on the equated groups, and the second on the
total population. It is interesting to note that the same general results were
obtained when the total population was considered as when the smaller
equated groups were used.


TABLE III
EQUATING OF FIVE CLASSES THAT HAD FILM ON RELATIVITY
AND LECTURE ON PETROLEUM

        No.      M Film      M Lecture     σ Film     σ Lecture
Class   Pupils   Gain        Gain          Gain       Gain

  1     19       12.4 ± .5   25.0 ± .8     3.5 ± .4   [illegible] ± .5
  3     13       12.0 ± .8   25.7 ± 1.1    4.2 ± .5   [illegible]
  5     12       12.3 ± .8   26.1 ± 1.0    4.1 ± .5   [illegible]
  7     14       12.2 ± .8   26.3 ± 1.0    4.4 ± .5   [illegible] ± .6
  9     14       12.4 ± .9   25.8 ± .9     4.9 ± .6   [illegible] ± .7


ANALYSIS OF DATA 


The immediate gain from the film or the lecture presentation was 
calculated for each pupil by finding the difference between the pretest score 
and the score made on the same test the day after the teaching. The amount 
of retention for each pupil was calculated by finding the difference between 
the pretest score and the score made on the same test at the end of the time 
interval in question. 
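The two quantities just defined differ only in which later score is compared with the pretest. A minimal sketch, with hypothetical function names and invented scores for a single pupil:

```python
def immediate_gain(pretest: int, next_day: int) -> int:
    """Gain credited to the teaching procedure: next-day score minus pretest."""
    return next_day - pretest


def retained_gain(pretest: int, delayed: int) -> int:
    """Gain still held at the delayed recall test, measured from the same pretest."""
    return delayed - pretest


# Invented scores for one pupil on the 69-item Petroleum test.
pupil = {"pretest": 20, "next_day": 45, "delayed": 38}

gain_now = immediate_gain(pupil["pretest"], pupil["next_day"])
gain_kept = retained_gain(pupil["pretest"], pupil["delayed"])
print(gain_now, gain_kept)  # prints: 25 18
```

Both gains are measured from the same pretest, so the retained gain can be compared directly with the immediate gain for the same pupil.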

From the data thus assembled the mean gain for each class from each 
procedure was calculated. Inspection of these data showed that larger gains 
had been made in the lesson on Petroleum than in the lesson on Relativity, 
due probably either to a difference in the type of material or to a difference 
in the tests used, and that therefore the raw gain scores in Relativity were 
not directly comparable to raw gain scores in Petroleum. Since our analysis
demanded the summation of the gains made in the two units in order to find
the total lecture gain and total film gain for each group, it was thus
necessary to change all raw gain scores to standard scores.*
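The footnote defining the standard score is not legible in this copy, so the exact conversion used is unknown. The sketch below assumes one plausible definition, each raw gain divided by the standard deviation of the raw gains for its unit, purely to show why some such conversion makes gains from the two units summable. The function name and all gain values are invented.

```python
from statistics import pstdev


def to_standard_scores(raw_gains: list[float]) -> list[float]:
    """ASSUMED conversion: express each gain in units of the unit's own SD."""
    sd = pstdev(raw_gains)
    return [g / sd for g in raw_gains]


# Invented raw gains; Petroleum gains run larger, as the article reports.
petroleum = [22, 25, 28, 31, 19]
relativity = [10, 12, 14, 16, 8]

for name, gains in [("Petroleum", petroleum), ("Relativity", relativity)]:
    scaled = to_standard_scores(gains)
    # After scaling, both units have a spread of 1, so their gains can be summed.
    print(name, round(pstdev(gains), 2), round(pstdev(scaled), 2))
```

Whatever the original definition was, the purpose stated in the text is the same: to put gains from tests of differing difficulty on a common scale before adding them.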

The mean gain for each class for film and lecture procedure was then
calculated in terms of standard scores and the mean gain for each pair of
classes (Groups I, II, III, IV, and V) was calculated both for the immediate
recall and for retention after the stated time interval, which varied from
Group to Group.

The results have been tabulated in Tables IV and V. 


TABLE IV
GAIN MADE IN TERMS OF STANDARD SCORES AS A RESULT OF FILM OR LECTURE
PRESENTATION OF MATERIAL TO TEN EXPERIMENTAL CLASSES

        M         M                 M         M        Difference
        Lecture   Film              Lecture   Film     (Lec. G -
Class   Gain      Gain      Group   Gain      Gain     Film G)       C. R.

  1     2.9       2.3
  2     2.8       1.8       I       2.9       2.0       .9 ± .09     10
  3     3.0       2.2
  4     2.8       1.8       II      2.9       2.0       .9 ± .14      6.4
  5     3.0       2.2
  6     2.8       1.1       III     2.9       1.9      1.0 ± .13      7.7
  7     3.1       2.2
  8     2.9       1.7       IV      3.0       2.0      1.0 ± .12      8
  9     [illegible]
 10     2.9       1.7       V       [illegible]                       6.4

                                    2.9       2.0


The immediate mean gains and the retained gains were also calculated
for the original classes (as they were before equating) and the results have
been summarized by Groups in Table VI.

It will be noted that each of the ten classes made a larger immediate
gain from the lecture than from the film presentation. Since the comparisons
of film and lecture gains for a single class are based on a film gain from one
unit of work and a lecture gain made from another, the comparison of the
groups, where mean gains from both units by both procedures were obtained,
is probably more significant. Inspection of these data for the five groups
shows that in each case the lecture gain was greater than the film gain, and
that the difference in each case was statistically significant, as the critical
ratios ranged from 6.4 with Groups II and V to 10 with Group I. In terms
of standard scores, the gain from the film was 69 per cent of the lecture
gain, when considering the results from all ten of the equated classes.

*Standard score
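The critical ratios appear to be each group's lecture-minus-film difference divided by the error figure printed beside it in Table IV. That reading is an assumption, though it reproduces the legible entries. The sketch below uses the differences and errors as read from the table for Groups I to IV, with a hypothetical function name, and also checks the 69 per cent figure against the overall mean gains of 2.0 (film) and 2.9 (lecture).

```python
def critical_ratio(difference: float, error: float) -> float:
    """ASSUMED definition: difference between means over the error of that difference."""
    return difference / error


# (difference, error) pairs as read from Table IV; Group V is not legible.
groups = {
    "I": (0.9, 0.09),
    "II": (0.9, 0.14),
    "III": (1.0, 0.13),
    "IV": (1.0, 0.12),
}

for name, (diff, err) in groups.items():
    print(name, round(critical_ratio(diff, err), 1))

# Film gain as a share of lecture gain, from the overall means.
film_share = 100 * 2.0 / 2.9
print(round(film_share))
```

On this reading the computed ratios agree with the printed values of 10, 6.4, and 7.7, and the film-to-lecture share rounds to 69 per cent.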

The consistency with which the immediate lecture gains were larger,
and the large critical ratio of the differences, leaves little doubt of the
superiority of the lecture over the film in this particular experimental
situation. The next question to be considered is that of the retention of the
gains made. Is there any tendency for the material presented by means of
the film to be remembered better than material presented by lecture?

An inspection of the data from Table V shows that in seven cases out
of ten the lecture gain remained greater than the film gain at the time the
delayed recall test was given. In classes 5, 7, and 9, which showed a larger
immediate gain from the lecture procedure, at the end of 9, 12, and 15
weeks respectively, the retention of the film material was greater than that
of the lecture.

Turning from the individual classes to the five experimental groups,
it will be noted that in each case the lecture gains remained greater than
the film gains during the period covered by the study. It will be further
noticed, however, that the significance of the difference between the two
procedures, as indicated by the critical ratio, tended to become less with
the passage of time. Thus the smallest critical ratio based on differences in
immediate recall was 6.4 (Table IV). At the end of three weeks the critical
ratio was 5.0, at six weeks 3.9, at nine weeks 4.1, at twelve weeks 3.5, and
at fifteen weeks, only 2.4.

Turning now to the retention curves for lecture and film material
(Figure 1), our attention is immediately called to the unusual situation
existing at the end of the sixth week. The group tested at the end
of six weeks (Group II) actually retained more of the lecture material and
just as much of the film material as Group I tested at the end of three weeks.
That this was not a chance situation produced by the elimination of a
portion of the population by the equating procedure is shown by the fact that
when the entire population is considered, Figure 2, the same superiority for
Group II, tested at the end of six weeks, is still evident; in fact, it is more
conspicuous. Figure 2 shows the strange situation of a group remembering
as much, as shown by their mean score, after six weeks, as the ten classes
as a whole could recall immediately after the teaching.

Since Group II did not have superior learning ability (all groups were
equated on the basis of gains made from teaching) it seems that their
unusually high scores on the delayed recall test, as compared with the other
four groups, probably were due to some advantage derived from the nature
of their work during the six week period, or to some advantage gained in
the testing situation. Circumstances have made it impossible to investigate
these possibilities. It would seem, however, so probable that Group II did
derive some advantage of this sort, that the results from this Group will be
disregarded in our further consideration of the retention curves, and dashed
lines have been drawn on the curves to indicate the curve if Group II were
eliminated.
The retention of the lecture material (Figure 1) was found to suffer
the greatest loss during the first three weeks after learning (an average of
.23 standard scores per week), then leveled off to an almost uniform rate of
loss (disregarding Group II) from the third to the twelfth week, with an
average loss of about .05 standard score per week, and then remained on a
level, showing no loss from the twelfth week to the fifteenth week. The
results were essentially the same with the entire group (Figure 2) as with
the equated group, the greatest difference being that the curve became
practically level with this group at the end of nine weeks instead of twelve.
Probably the greatest difference between the curve for the retention
of film material and that for the retention of lecture material is in the first
three week period. There, while the lecture group lost an average of .23
standard scores per week, the film group lost only .1. From the third week
to the twelfth week the curves are very nearly parallel, showing almost
exactly the same rate of loss for the film and lecture material. The group
tested at the end of the fifteenth week (Group V), Figure 1, showed a larger
retention than Group IV tested at the end of the twelfth week. That this may
have been due to the sample of the population included in the equated group
is indicated by the fact that when the entire population is considered (Figure
2) Group V retained less than Group IV.
Our data seem to indicate that during the first three weeks the lecture
material was forgotten more rapidly than the film material. The question
should be raised as to whether this greater loss was due to differences in
the original methods of presentation, or to other causes. In other words,
would our data justify the conclusion that material presented by film is
better retained over this three week interval than material presented by
lecture?

Other studies of retention have shown that a large amount of forgetting
can be expected to accompany a larger amount of learning.* This is exactly
the situation shown by our data. The lecture method produced greater
learning, as measured by our tests, and suffered greater loss from forgetting,
but always retained its superiority in gain. It thus seems probable that the
differences in the rate of forgetting may be due to differences in the amount
learned by the two procedures rather than to differences in the procedures
themselves. There seems to be nothing in our data to indicate that there
is any real difference in the retention of materials presented by the two
methods described.
To further compare the retention of materials presented by lecture
and film, curves were drawn to show the percent of the material learned that
was retained over the various time intervals used in the study for both the
equated groups and the entire population (Figures 3 and 4). These curves
show that at the end of three weeks the percent retained was considerably
larger for the film than for the lecture groups (85 per cent for film as
compared with 76 per cent for the lecture with the equated groups) but as
time went by the difference in the percent retained tended to become less.
Thus at the end of twelve weeks the percentage retained for both procedures
was almost identical with the equated groups. The gain in percentage
retained by the group tested at the end of the fifteenth week is probably due
to chance factors in equating, as pointed out before. It will be noted that
when the entire population is considered (Figure 4) the percent of retention
at the end of the fifteenth week is the same for both procedures, 59 per cent.
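The three-week figures can be cross-checked against one another. Taking the overall immediate gains of 2.9 (lecture) and 2.0 (film) standard scores and the average weekly losses of .23 and .1 reported in the text, the sketch below recovers the 76 and 85 per cent retention figures. Treating the loss as linear over the three weeks, and the function name, are assumptions made for illustration.

```python
def percent_retained(immediate_gain: float, weekly_loss: float, weeks: int) -> float:
    """Percent of the immediate gain remaining, ASSUMING a constant weekly loss."""
    retained = immediate_gain - weekly_loss * weeks
    return 100 * retained / immediate_gain


# Immediate gains and average weekly losses as reported for the equated groups.
lecture = percent_retained(immediate_gain=2.9, weekly_loss=0.23, weeks=3)
film = percent_retained(immediate_gain=2.0, weekly_loss=0.1, weeks=3)

print(round(lecture), round(film))  # prints: 76 85
```

The lecture group loses more in absolute terms yet still holds the larger gain after three weeks (about 2.2 against 1.7 standard scores), which is the point the article makes about forgetting accompanying greater learning.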


CONCLUSIONS 


This study has compared the immediate gain and retained gain over
three week intervals up to fifteen weeks on material presented in two ways.
The first was a lecture presentation making use of blackboard diagrams and
charts. The second was the showing of a silent motion picture without
introduction or comment by the teacher, a purely visual experience. The
lecture and the film covered the same units of material and each required a 3
minute period for presentation.

*A convincing demonstration of this is the Arnspiger Sound-Picture Experiment
reported in Devereux, Frederick L., The Educational Talking Picture. Chicago, Illinois:
The University of Chicago Press. Pp. 76-93.

The study seems to indicate that the increased learning which comes
from the use of visual materials, as determined by many investigations, is
not due primarily to the visual experience alone, but rather to the adding
of a visual experience to other teaching procedures. It suggests that visual
experiences alone (the mere viewing of pictures in this case) may be less
effective than the lecture method, at least for informational learning. Visual
experiences, when integrated with other experiences, such as listening to the
teacher as she raises questions or makes explanations, participating in
discussions, etc., have been demonstrated to be effective. Teachers are
probably not, however, justified in assuming that the visual experience is so
effective that other types of experience should be eliminated in its favor. The
most effective learning will probably come from the proper integration of many
types of experience, not from concentration upon one.
EDITORIAL


A RESEARCH SERVICE FOR FIELD WORKERS


THE writer in pursuing a hobby outside the field of professional
education these recent years has had an opportunity to make use of the
agricultural service of three large mid-west universities. Gradually the thought
has arisen that professional education has much to learn from agriculture.
Few would deny that the leadership in this area has been superb and
productive.

While different persons looking at this service will see and emphasize
different aspects, the writer sees, first of all, a service thoroughly grounded
in research. In general only a very small percent of the educationalists in any
given institution of higher learning are engaged in research, and most
of these only on a part time basis; the agriculturists have many research
workers working on a full time basis, and others giving substantial portions
of their time to it. Almost everyone at the university level is engaged in
research of one sort or another. There seem to be three types of research
workers: (1) the “pure” scientist, found largely in the science departments
of liberal arts colleges; (2) the applied laboratory researchers in
biochemistry, bacteriology, plant pathology, entomology, genetics, etc.; and
(3) the field researchers in soils, animal husbandry, agricultural engineering,
field crops, and management. They all seem to be guided by a deep desire to
get at verifiable facts and to study the problems of farm folk.

One distinct difference between the services of professional education
and agriculture is the extensive university bulletin service of the latter. While
educationalists have their bulletins there seem to be few bulletins grounded
in research and designed for field workers. This situation may arise in part
from the conventions that prevail in educational problem solving, or from
reporting practices in this field. Of the research publications that are
available many feel that they are too technical to be particularly helpful to
field workers. Some, and possibly rightfully so, will say, too, that the
difficulty has arisen in part from the lack of financial support for educational
research projects and bulletin services. Be this as it may, field workers
frequently feel that they are not served. Nowhere in education does one find
a service approximating that of the agricultural college in readability,
practical assistance and scientific worth. Agricultural workers have their
scientific journals and meetings where they get together to secure new
technical leads, but the dispensing of scientific information to farmers is
turned over to professional writers trained in the subject matter of
agriculture who prepare bulletins on all sorts of practical subjects. These
bulletins cover such topics as keeping farm animals healthy, feeding
beef-cattle, care and repair of farm machinery, corn silage, soil building
practices, storage diseases and their control, sweet clover and its culture,
diseases of sheep, farm records, legumes in soil improvement, drainage
methods, home dehydration of foods, contour farming, selecting and storing
seed corn, pruning apple trees, copper sprays and insecticides, and hundreds
of others. Our journals, bulletins, and summaries of research serve a valuable
purpose but they are not of immediate service, as a rule, to field workers.


The extension service of agriculture is second to none. Any staff member
may be called at any time for expert advice and research upon pressing
problems. The local clearing house is the county agent. Education has a county
agent too, but by tradition and legislative enactment his position is of a
very different sort. Generally speaking, these agents are not dispensers of
scientific understanding, but very busy persons charged with the
administrative responsibilities of inspecting and operating schools. The getting
of scientific information to teachers and others who may use it has, by and
large, not been successfully achieved. Not only must there be research but
there must be local leadership in dispensing research information. Possibly
the real lack is that of having something to dispense.


The writer in making this statement is not unmindful or unappreciative
in what is here said of the very large amount of research done in education.
Our bureaus of educational research, clinics, and laboratories in universities
and large city school systems render a valuable service; some are more
serviceable than others depending upon the training, experience and outlook of
their personnel, and the opportunities provided them. We have spoken
principally of neglected opportunities, needed help and our reporting service.
Very much more research needs to be done on problems considered important
by field workers, and a very much better program for synthesizing and
disseminating this information than we now have needs to be developed.
Neither does the writer think that the job can be done by any one group of
researchers. We need “pure” researchers, applied laboratory researchers, and
field researchers; each has a place and an important contribution. We are
completing now with this decade about fifty years of educational research.
The initial enthusiasm has passed but the ground work for the later
structure has been laid and much splendid work accomplished. The time has
come, however, for the hard but not necessarily the more prosaic work of
making education a science. Educationalists might take a leaf from the book
of agriculture.
A. S. BARR
Address books to be reviewed and other communications to A. S. Barr,
Department of Education, University of Wisconsin, Madison 6, Wisconsin.


GRAY, WILLIAM S., Editor. Adapting
Reading Programs to Wartime Needs.
Supplementary Educational Monographs,
No. 57. Chicago: Department
of Education, University of Chicago,
1945. Pp. viii + 284.
Adapting Reading Programs to Wartime
Needs is a clearcut report of the
proceedings of the Sixth Annual Conference
on Reading, held at the University
of Chicago. The report begins with Dr.
Gray's pertinent account of current demands
for the improvement of reading.
In spite of President Roosevelt's reading
programs as shown by the Army tests,
scientific evidence reveals an increase in
the efficiency of the teaching of reading
during the last twenty-five years. But the
demands of today’s complex society have
increased proportionately, and there is an
urgent need for a higher level of literacy.
Many adjustments in our reading programs
are imperative, if the adult desired
minimum level is raised from a grade
score of 7.0 to that of 9.0. School systems
must face the responsibility of self
appraisal and improvement.

Following Dr. Gray's stirring introduction,
the monograph is divided into eleven
parts. They deal with reading and pupil
guidance in wartime, materials to develop
emotional stability on all school
levels from primary through junior college,
techniques for greater reading efficiency,
the teaching of the literature of
power, comics, map-reading and news
interpretation, reading growth in the content
fields, the needs of poor readers, special
wartime problems in high schools
and colleges, co-ordination of reading
programs, and finally, a summary.


Wartime reading needs of children and
youth are brought into sharp focus and
excellent practical suggestions are offered for
guidance in areas of reading both for
escape and for fulfillment. The need for
broad knowledge about the war demands
skill in reading of many kinds of factual
materials such as newspapers,

But accompanying this need is one
equally as great but often forgotten in the
emphasis of work type reading in our
schools. It is the need for reading the
literature of power. This has been vividly
described as: “Whatever in literature can
bring statistics to life, or make historical
characters move in the setting of familiar
humanity, or clothe with flesh and blood
the cold figures of a casualty report in
the newspapers, or help the reader to
gain insight into his own way of life,
this is literature of power.”* Only through
the guidance of such literature can we
hope to arouse the imagination of youth
so that there will be visions of ways to
defend the ideals of today and to perpetuate
them into the world of tomorrow.


*Part IV, The Teaching of the Literature
of Power and Imagination, John
De Boer, p. 133.
A stimulating approach to the problem
of the comics was suggested. We should
capitalize upon the challenge comics offer.
They are based on popular appeal,
have rigid moral standards, are adventurous,
use the hybrid form of language,
picture and printed word, are powerful,
and reflect the con[illegible]
of the times.

The report of the efforts to co-ordinate
reading programs in the Chicago public
schools is valuable for administrators.
Reading committees in each elementary
school and the appointment of reading
co-ordinators in each high school are
suggestions for an effective attack.


I recommend this monograph as a
practical reference not only for teachers
and administrators, but also for laymen
who will find it useful in various kinds
of community service. Civic minded laymen
could use it for guidance in adjustment
bureaus, recreational centers, and
many other places.

MARY WILLCOCKSON 


Mu 7} niversity 
CHEYDLEUR, FREDERICK D. Placement
in Foreign Language at the University
of Wisconsin. Madison, Wisconsin:
Bureau of Guidance and Records,
University of Wisconsin, 1943.
A recent bulletin published by the
Bureau of Guidance and Records of the
University of Wisconsin is one of a
series of reports on placement examinations
in foreign languages at the University
of Wisconsin by Professor Cheydleur,
who has been in charge of this work
since its inception. Professor Cheydleur,
who has so ably directed the work of
placement in foreign language at the
University, has been a pioneer in this
field. Through his vigorous and competent
direction, Wisconsin attained early
leadership in foreign language placement.
His scholarly research and numerous
reports of the progress of the placement
program at Wisconsin and elsewhere
have received wide recognition. His
untiring efforts in this field have done
much to stimulate others to use objective
examinations in foreign languages in
institutions of higher learning throughout
the country. Fifteen years ago placement
had not been introduced in the colleges
of this country. Today 1[illegible] institutions
of higher learning sponsor a program of
placement in foreign languages.
The present bulletin is a summary of
the results of the use of foreign language
placement examinations at the University
over a period of thirteen years. The
effectiveness of this program, which
includes the results of the tests administered
to approximately 8,000 students,
is analyzed in terms of better curricular
adjustment, reduction of failures, and an
increased accumulation of credits and
electives.
The bulletin calls attention to the
establishment of the program in 1930 as a
result of the Report of the Fish Curriculum
Committee. Freshmen enrolling
in French, Spanish, German and Latin
were given the American Council Tests
and the Columbia Research Bureau Tests
in these subjects. The placement program
is described as follows:

“Our plan is to leave in normal position
those students whose mean scores
on the tests are approximately the
same as the national norms for the
high school units in college credits
offered. Next to promote one, two,
three, or four semesters those students
whose mean scores on the tests are
about the same as the national norms
at the first, second, third or fourth
semester above normal placement.
Finally, [illegible] a year (one college
semester) a student who offers three
years of high school credit, but whose
test scores show no more than two years
achievement when [illegible] by the
national norm. The promoted student is
allowed extra credits toward the college's
language requirements for various degrees
but not toward graduation. All placements
would be given a six weeks
trial. If at that time a student felt that
he was misclassified, upon recommendation
of his instructor he was permitted
to take an equivalent test and
be reclassified if the results justified
the change.”

During the thirteen years that the
placement tests have been used they were
administered to 7905 students. Of this
number 6355 or 80.4 per cent received
normal placement, 1252 or 15.8 per cent
were advanced and 298 or 3.8 per cent
were retarded. Fifteen and five tenths per
cent of the normal group received A's
and 3.6 per cent failed. Thirty-nine and
two tenths per cent of the accelerated
group received A's and .2 per cent failed.
These data tend to show that the students
that were advanced received better grades
than the average for their advanced work.
The author asserts that 11,000 credit
hours have been saved and made available
for electives. Twenty per cent of the
retarded group failed and fifty-three per
cent received grades of A's, B’s and C's.
A total of 767 students were advanced
one semester, 252 were advanced two
semesters, 34 were advanced three semesters
and 8 students were advanced four
semesters. Of this group 40.1 per cent
received A's, 43 per cent received B's
and 15.3 per cent received C’s. In other
words approximately 98 per cent of the
accelerated group received satisfactory
grades in foreign languages in comparison
with 81 per cent of the students
receiving comparable grades from the
normal group.
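The placement percentages quoted from the bulletin can be checked directly from the counts given. The short sketch below is only an arithmetic check on the figures as printed; the variable names are illustrative.

```python
# Counts of students by placement, as quoted from the bulletin.
placements = {"normal": 6355, "advanced": 1252, "retarded": 298}

total = sum(placements.values())
shares = {kind: round(100 * n / total, 1) for kind, n in placements.items()}
print(total, shares)

# Share of the advanced students receiving A, B, or C, from the quoted
# percentages of 40.1, 43, and 15.3.
satisfactory_accelerated = 40.1 + 43 + 15.3
print(round(satisfactory_accelerated))
```

The three counts sum to the 7905 students tested, and the computed shares match the 80.4, 15.8, and 3.8 per cent reported, as does the figure of approximately 98 per cent for satisfactory grades.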

Did the students that were advanced
one or more semesters to the more
literary phases of their language work fail
to master the more elementary and basic
aspects of their language training which
they skipped? In order to answer this
question the author administered the
American Council and the Cooperative
Foreign Language Tests and discovered
that 88 per cent of the advanced students
received satisfactory scores as compared
with 79 per cent for the students that
were not advanced. Twelve per cent of
the advanced group received unsatisfactory
scores while 21 per cent of the
normal group received unsatisfactory ratings
on the tests. The bulletin further sets
forth the value of standardized tests in
the foreign languages for purposes of
prediction, for evaluating levels of
student achievement, and for evaluating
college instruction. Other questions related
to the problem of placement at Wisconsin
and in other institutions are
discussed.

The 39 page bulletin affords a concise
summary and evaluation of the language
placement program at Wisconsin. The
information presented is objective in
character and the interpretation and
evaluation of the data is non technical. The
statistical treatment is couched in the
language of the lay reader. The bulletin
is well organized, highly informative, and
presents an interesting factual summary
of an extensive program of research.

T. L. TORGERSON
University of Wisconsin